Why C? Why the Instant Guide?


Since the proliferation of computer languages nearly a quarter of a century ago, 
there's only one that's still moving from strength to strength. Combining unrivaled
flexibility and power, the C language remains the ultimate gateway to the 
programming world. As modern implementations of C continue to develop and 
push back programming frontiers, every programmer who isn't C-worthy needs a 
quick route into original C in order to keep up. Whether you're bandy-legged in 
BASIC, cock-a-hoop in COBOL or sitting pretty in PASCAL, this book should 
interest you. 


As with all our Instant series, our aim is to produce a thorough yet fast-paced 
guide to a popular and developing language. All the language’s major topics, 
concepts and constructs, including key programming elements such as structures, 
arrays and pointers, are dealt with in detail in the first half of this book. In the
second half we focus on particularly powerful and involved concepts, such as 
input/output, the pre-processor and system specifics, at a level that'll leave you 
feeling confident in the C environment. Finally, we round off with a complete 
tutorial showing you how to write a proper C application which will illustrate 
how many C elements integrate into a whole. We hope that Instant C will act as 
your critical C reference guide, becoming your key to a unique and exciting 
programming world. 
Introduction





Welcome to the latest Instant Guide from Wrox Press, Instant C. This book 
has been designed as a guide to learning one of the most popular 
programming languages ever created. Introducing everything you need to 
know in order to become a capable and knowledgeable programmer of the 
C language, this book will enable you to understand the concepts involved 
and the essential techniques. Because we at Wrox Press believe that our 
readers are intelligent creatures, we won't waste time by patronizing you 
with over-simplistic examples. We will be introducing new practical advice 
and a fresh approach to many exciting techniques from some of the 
industry's finest developers. 


What Can C Do for Me? 


Most people, even those with little computer experience, will have come 
across the C programming language. For decades, C has enjoyed a 
dominance rivaled by very few other languages, and market status that still 
demands a great deal of respect today. In fact, many new pretenders to the 
throne of global domination that C achieved actually use C as a subset of 
their whole system, effectively making it an obligatory language to learn. 
Who Should Use This Book?


Instant C contains the information necessary to transform the reader from a
person with knowledge of a different computer language to one who is a
competent C programmer with the ability to write efficient, well-structured
and clear code. The types of people likely to benefit from this book are:


The programmer who wants to move up to the industry standard,
usually from BASIC, Pascal or COBOL.


The C programmer who needs a quick revision on the C
programming language.


The amateur C programmer who requires a no-nonsense reference
guide.


Everybody who requires a concise and informative guide to all the
aspects of the C programming language.


What You Should Know 


To get the most out of this book, you should have some basic programming 
knowledge; you don't need to have spent five years of your life learning 
assembler, but you should understand the basic concepts that are common 
to most programming languages. Since we are going to take this tour at 
quite a fast pace, we won't be dwelling on basic programming techniques 
that you already know, and we won't be discussing specific features of 
systems that you may or may not have. What you will be learning is how 
to program in C, and how to write programs that you want to write. 


If you have no programming experience at all, don't worry, because we 
start from the very beginning. We won't be shirking our responsibilities to 
you, because we are covering everything you need to know, but be prepared 
- we're not going to dally with every last detail of the language. 


As all programming languages require a basic grasp of mathematics, we will 
assume that you understand some of the principles behind the major 
concepts. Don't worry, we won't be using calculus and umpteen different 
theorems, but you simply can't avoid brushing against the subject when
you're dealing with programming. 


Conventions Used 


To help you to find your way around this book easily, we have used 
various styles to highlight different references. Each style has been chosen to 
give you a clear understanding of the information that we have supplied. 


Dialog 


All code and programs are highlighted with a gray background, so that you 
can locate them easily. For example: 





or Second, Third) ; 








When we need to show the whole program, we sometimes repeat parts of
our code. Lines that are new additions to the program are shaded, and
lines that are repeated are left unshaded. This will enable you
to immediately see where new code has been added:


struct Phone
{
   char *pName;               /* Pointer to a name */
   char *pNumber;             /* Pointer to telephone number */
   struct Phone *pLeft;       /* Pointer to Phone object < current */
   struct Phone *pRight;      /* Pointer to Phone object > current */
};
Fonts and Styles


Throughout Instant C we will consistently use the following styles and fonts 
for a variety of textual distinctions: 





When code or a command is mentioned in the middle of a sentence, 
we write it in this style, so as to emphasize its origin. Filenames, 
such as FILENAME.C, however, are always written in that style. 


Important words are introduced in this style. These are significant 
words that we are meeting for the first time. Subsequently they will 
appear as normal text. 


Actual keys that you press will be displayed in this style, for 
example, press the Return key. Note that Ctrl-K depicts the depression 
of a Control key with the K key. 


Output text that appears on your screen, such as field names, menu 
items or headings, appears in this style.


When we introduce the syntax of code, we will use the following 
bracket styles: 


[ ] Optional.
< > Obligatory.


If a word needs to be emphasized, then we will italicize it, like this.


We have attempted to break the text up, by using appropriate and 
consistent headings and with the judicious bulleting of lists. 





When an important piece of information needs to be really 
emphasized, then we will place it in a stand-alone box like 
this. 


C Variance


C is such a popular language that it forms a highly competitive market.
While organizations such as ANSI (see Chapter 1) strive to develop
standards, this market spawns a host of different modes and environments
where conventions differ. When using C, you must bear in mind that different
systems, variants and versions will produce different responses and output.
Tell Us What You Think


One last thing: we've tried to make this book as enjoyable and accurate as 
possible. We are here to serve the programming community, so if you have 
any queries, suggestions or comments about this book, let us know. We are 
always delighted to hear from you. 


You can help us ensure that our future books are even better, by simply 
returning the reply card at the back of the book or by contacting us direct 
at Wrox. For a quick response, you can also use the following e-mail 
addresses: 


feedback@wrox.demon.co.uk
Compuserve: 100063,2152 


Please return the reply card at the back of the book and tell us what you 
think of the book, the style of presentation and the content. We are always 
ready to listen to comments and complaints (although we do prefer 
unadulterated adoration!). 


Getting Started 


Thank you for buying Instant C. We hope you enjoy it and become a 
proficient programmer of C in as short a time and as smoothly as possible. 
All our efforts have been aimed at bringing you maximum satisfaction, and 
if you want to learn C, then we're convinced that this book will fulfill all 
your immediate requirements. Anyway, you've bought this book to learn C, 
and we shall not hold you from it any longer. Dip in and enjoy! 
Programming in C


This chapter is just a toe in the water. We'll start by looking at the general 
characteristics of C, and why it's still the language of choice for so many 
development environments. We'll also get a feel for how the strategy of 
developing a C program proceeds. In this chapter you'll learn: 


What the primary advantages of C are.
How a C program is structured.
What a simple C program looks like.
What libraries are and how they're used.
How C programs are processed in some typical development systems.
What differences arise with C programs in different computer environments.
The Characteristics of C


If you were only allowed one programming language, you would most 
likely choose C, as it has so many advantages over all the others available. 
So just what are these advantages?


Prevalence 


C was originally developed in the UNIX environment by Dennis Ritchie in 
the 1970s. Since then, because it has proved to be such an effective 
programming language, it has become available in the majority of computing 
environments. 








If you know how to program in C, then you can write
programs on almost any type of computer.





You will find C compilers on personal computers, UNIX workstations, 
minicomputers, as well as mainframe computers and many embedded 
microprocessors in control applications. This prevalence has also had a large 
effect on the cost of C compilers, with many excellent versions dropping 
their cover price dramatically. 


Portability 


Program portability is the ability to transfer a program in source form from 
the original development environment to different computers, and to 
successfully generate a working version with little or no effort. With an 
ANSI standard defined for the language, you have the potential to write 
code that can be easily moved from one machine to another. 

















In 1983, the American National Standards Institute (ANSI)
commissioned a committee to standardize the C language.
Ratified by ANSI in 1989 and adopted as an ISO standard in 1990,
ANSI C standardizes existing practice, includes enhanced features
and formalizes library support routines.











The value of this in large commercial application developments is hard to 
overstate. 







Flexibility 


Most of the UNIX operating system was written in C, as was Microsoft 
Windows, and it remains a preferred language for systems programming. C 
provides you with the ability to write low-level code (efficient, but difficult 
to learn), whilst retaining all the advantages of a high-level language (less 
efficient, but easy to learn). For this reason, many commercial applications 
are written in C. 


Easy to Learn 


Because C is so compact, you will find it very easy to learn, so you can 
become a competent programmer very quickly. Of course, writing good 
quality code needs time and experience, as with all programming languages, 
but the volume of existing code that you can use as a model and the range 
of commercial tools to help you is of enormous assistance. 


Efficient 


As C is a compact language with a simple structure, it's easy to generate 
efficient machine code directly from it. The pre-written support libraries, 
which are implicitly used, are professionally written and optimized. There 
are also specialized high performance libraries available. Many people 
program in BASIC because it's easy to use, but it has the disadvantage that 
it's usually executed via an interpreter. If you contrast the performance of 
an application written in C with that of an equivalent BASIC program, then 
you'll find that the C program can execute much faster. 


Variance 


To maintain its position at the top of the programming tree, C has given 
birth to several new versions. New languages such as C++ are taking the C 
language as a subset, whilst making additional use of new programming 
methods and styles. 





C++ development systems will also compile C, so if you want 
to use C on a PC, and you think that you might want to use 
C++ some time in the future, then purchasing C++ might be 
a cost-effective option. 














Let's make a start by looking at how a C program hangs together, and get
a grasp of some of the basic terminology. 
The Structure of a C Program


For the moment we are just going to get the basic concepts straight, 
without getting into specific details. All of the examples of C programming 
that appear here are just to illustrate the principles involved, and we will 
be looking at what they do in much more depth in the following chapters. 


Let's see what a complete C program looks like. Here's a simple one
consisting of the function main() plus one additional function Response():


We will now discuss these main aspects of a typical C program. 


[Figure: the main aspects of a C program: comments, header files, statements, blocks, input/output statements, keywords, whitespace, functions and source code]





Statements 


The basic unit of C programming is a statement, and the collection of 
statements that make up a program are referred to as the source code. A 
statement in C always ends with a semi-colon. An example of a statement 
is:


   OurWeight = MyWeight + YourWeight;



The effect of this statement is to add together the values of two things
called MyWeight and YourWeight, and to store the result in something
called OurWeight. Statements are generally executed in sequence in a
program, unless a statement specifically changes the sequence of execution. 


Whitespace 


The example statement mentioned above has several spaces embedded, to 
make it more readable. It wouldn’t matter if they were omitted and written 
as ‘OurWeight=MyWeight+YourWeight’, because the C compiler can separate 
out the component parts, or ‘tokens’, which make up the statement. 


These filler characters are called whitespace characters and also include 
space, newline, carriage return, tab and form feed. You can put in as many 
as you like and even space a statement out over more than one line. The 
exception is within a quoted string, which we'll discuss later. 


You must use whitespace where there could be ambiguity; for instance, if
the statement ‘int number = 0;’ were written as ‘intnumber = 0;’, then the
compiler would believe that you're referring to a variable called intnumber.
We could quite happily write ‘int number=0;’, because the equals sign and the
zero are correctly interpreted by the compiler, even without whitespace.


Blocks 


Statements can be grouped together in a block by placing them between
braces. An example of a statement block is:


{
   MyWeight = 180;
   YourWeight = 300;
   OurWeight = MyWeight + YourWeight;
}


This block of code specifies two things called MyWeight and YourWeight as
having values of 180 and 300 respectively, and then adds them together,
putting the result in something called OurWeight. Note how the statements
are indented within the block. This is a visual cue to the extent of the 
block. Indenting statements isn't mandatory practice and it isn't part of the 
language definition - the compiler will compile correct code regardless of 
how it's laid out. However, indenting statements properly to make a 
program more readable is good programming style, so you should get into 
the habit of indenting statements appropriately from the start. 


Wherever you can write a statement within a C program, you can also have 
a block of statements, so blocks can be nested within one another. We'll 
come back to the subject of containing statements within a block in Chapter 
(A 


Functions 


A program in C consists of one or more functions. A function is a self- 
contained block of program code, which performs a specific set of actions or 
calculations and has a name by which it is referenced. It can have values 
passed to it, and it can return a value. A simple function might look 
something like this: 





This particular function calculates three times the value that is passed to it. 
The first line of the function definition specifies the name of the function, 
what kind of data is passed to it and what sort of value it returns. The 
computation that the function performs appears in the statement that sits 
between the braces. There will usually be several statements between the 
braces. The program statements making up the function are called the 
function body and are always enclosed between braces. In the previous 
example, the body of the function consisted of only one statement. We'll see 
more about how functions are defined in Chapter 5. 


Function Execution 


A function is executed by calling it in a program statement using the
function name. We could use the function called treble() that calculated
and returned three times the value passed to it, by using the statement:


   Result = treble(5);


The value 5 between the parentheses, called the argument, is passed to the 
function treble(), and the value that is returned from the function is 
stored in something called Result. When a function is being referred to in 
the text within this book, it will always be written with parentheses after 
the function name - as in treble(). This is to distinguish a reference to a 
function from references to other things that have names, which we will 
discuss later. 


Any given function can be called many times from different points in a 
program. 


The Function main() 


A C program always contains a function called main(), and execution starts 
with its first statement. Let's consider the execution of a hypothetical 
program: 


[Figure: The Functional Structure of a Program. The left panel shows the program code; the right panel shows the program execution, beginning at Program Start.]
On the left, the illustration shows a representation of the code for a
program containing three functions. On the right, the diagram shows the 
sequence in which the functions making up the program might be executed. 
The arrows indicate how the sequence of execution of the statements in the 
program passes from one function to another, and then back again when 
execution of a particular function is completed. 


As with all C programs, execution starts at the beginning of the function 
main(). The function main() first calls Function1() and then Function2(). 
Function1() is also called twice from within Function2(), so it's actually 
used a total of three times within the program. At the end of program 
execution, control passes back to where it started from - the operating 
system. 


A practical application written in C will typically consist of a large number 
of small functions, each with a well-defined purpose. Functions other than 
main() may be called in any sequence from anywhere in the program, as 
many times as required. As we shall see when we get to discuss writing 
our own functions, a function can even call itself. 


Source Code 


With a small C program, the complete program text (the source code) can be 
contained in a single file. However, with large C programs, storing the 
complete program, which may run to thousands of lines of code, can 
become quite unmanageable. Therefore, the source code for a complete C 
program can be evenly distributed across several separate program files in 
order to manage it efficiently. 


Comments 


Comments are included in a program to explain how it works. They aren't 
part of the program and are ignored by the compiler; they're just there to
help the programmer read it more easily. The text of a comment is bounded 
by /* and */. For example, the first line of a program could be: 





The comment above only covers part of a single line, but a comment can span
several lines, for example:







If you want to highlight some particular comment lines, you can always 
add characters to embellish them with a frame: 


/**************************************
 *  A framed comment stands out well  *
 *************************************/





Good Practice 


As a rule, you should always comprehensively comment your programs.
Your comments should be sufficient for another programmer to understand the
purpose and workings of any particular piece of code. You should also 
comment any unfamiliar terminology, new concepts and additional 
information. Throughout this book we will be adopting the standard Wrox 
policy of thorough, easy-to-follow commenting. This will enable you to 
understand every step of every program at the appropriate point. 


Like many programmers, you'll find that putting comments in a program is 
an awful chore, unless you get into the habit of putting them in early on in 
your programming career. However, it's worth steeling yourself to make the 
effort. If you don't pick up the habit now, then the first time that you have 
to fix a program without adequate comments will be an educational 
experience. 


Keywords 


There are several words with very specific meanings which form part of the 
C language. These are called keywords. Examples of keywords are long, 
which defines something as having integer values, or sizeof, which is an 
operator. Keywords are reserved words, which means that you can't use 
them for any other purpose - if you do, the compiler will get confused. 
We'll identify each particular keyword as we work through the language. 
Appendix C contains a list of all the keywords in C. 
Input and Output


There is no input and output of any kind within the C language. These 
operations are provided solely by standard functions that are external to the 
compiler. Nonetheless, they are standard. You will find the same set of 
input and output facilities available with any ANSI-standard or ANSI- 
compliant compiler. The support for standard input is via the process of 
reading from a file called stdin. This usually corresponds to the keyboard, 
but it can be redirected from other files or pipes by an operating system 
command. Two standard output files are supported, called stdout and 
stderr, both of which are connected by default to the screen. Normal 
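screen output uses stdout, while stderr is used for error messages. Note
that the two can be redirected independently, so error messages can stay on
the screen when normal output is sent elsewhere. The MS-DOS command line
can redirect only stdout, whereas UNIX shells can redirect both.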


Some compilers provide input/output functions in addition to those defined 
as standard, but naturally there’s no guarantee that these will be available 
in other C development contexts. We'll stick to the standard functions in 
this book. 


By itself, input and output is a rather tedious topic. There's a lot to it and 
grinding through all the detail at a single sitting is about as interesting as 
watching paint dry. It’s also quite hard to take it all in, in one go. For this 
reason, although there is a specific chapter on file operations in this book, 
there is no specific chapter on keyboard input, or output to the screen. 
Instead we will develop our understanding of this piecemeal, as we 
introduce aspects of these operations in a useful context through examples 
in this book. Appendix A provides a complete rundown on formatting 
input/output. 


Libraries 


A library is a store for functions and is a very important aid to
implementing C programs. Libraries provide a means of vastly enriching the
basic capabilities of the language, and offer support for an enormous range 
of applications. There is a set of standard libraries defined for C, and 
provided with every ANSI C system in a set of library files. These include 
basic input/output facilities, file operations, mathematical functions and
many others. If you stick to the standard set, then you can rest assured that
the same facilities will be available on any ANSI-standard compliant system, 
and that your program should run with minimal change, provided you have 
written it with portability in mind. 


To use a particular standard library, it's necessary to incorporate a file called 
an include file, or a header file, into your source program file. These contain 
information necessary to enable your program to use the standard library 
functions, as well as definitions of standard symbols of various kinds. We'll 
be discussing the standard libraries as we learn the language, and Chapter 7 
covers the use of libraries as a specific topic. 


Whatever C compiler you're using, you're certain to have some additional 
libraries supplied with it, beyond those within the standard set. Once you're 
comfortable with the C language, a little time spent investigating what these 
libraries contain will pay substantial dividends. In most cases you'll find 
capabilities that will save you a considerable amount of time when you're 
writing your own application. A vast range of other C libraries are also 
widely available from any popular and reliable source. 


Compiling and Executing a C Program


You will create your first C program using some kind of editor. The original
program in C is usually referred to as source code and is saved in a source
file. The process of converting your C program into a form that can be
executed on your computer involves three further steps: pre-processing,
which executes commands that alter the source file; compiling the source
code, which converts the C language statements to machine code; and
linking the output from the compilation process, which adds library
functions and knits it all together. The pre-processing phase is normally
integrated with the compile operation, although with C in the UNIX
environment, it is usually possible to obtain the output from the
pre-processing phase before it is compiled. The overall process of generating
an executable program is shown here:
[Figure: the process of generating an executable program: editing, pre-processing, compiling and linking]


Of course, if errors are produced at any particular stage, then the final 
output file from processing the program isn't produced. In almost every 
case, an error necessitates going back and fixing the source code using your 
editor. Getting through to having something you can execute doesn't mean
that you have a working program; there may still be errors in the logic of
the program. These errors can't be detected by the compiler or the linker 
and are usually the hardest to locate and understand. 


Let's now take a look at the main steps in producing an executable C 
program. 


Editing 


This is the process of creating or modifying your source code. The result of 
this process is a text file containing the source code for your program. By 
convention, your C source file has the extension .C, so you should use
recognizable labels like MYPROG.C, TRYOUT.C or some other descriptive name.
Most compilers will expect the source file to have a name with the 
extension .C. 


Text Editors 


Many modern compilers feature an integrated development environment,
which includes an editor specifically designed for editing programs. If yours
doesn't, then you can use a text editor, such as vi or emacs under UNIX,
or EDIT under MS-DOS. Using a word processor such as Word or WordPerfect
isn't usually a good idea, though, because the additional formatting
information which it includes in the file will cause the compiler to choke.
You can save the file without the formatting tokens, but it's much easier to
use a simple text editor.


Editing Environments 


Many commercial C compilers have their own specific editor that provides 
assistance in managing your programs and helping to minimize errors. 
Indeed, products from both Microsoft and Borland provide a complete 
environment for writing, managing, developing and testing your programs. 
Here's a typical commercial editing environment: 


[Screenshot: an integrated editor showing the source code of a simple C program, in which main() reads a name and passes it to the function Response()]
This provides a full screen editing environment where you can make
additions or modifications to your program simply by positioning the cursor 
where you want to make a change, and typing it in. The editor also 
provides syntax highlighting, with different language elements displayed in 
different colors. This gives you visual cues to where you have made errors 
in entering a program. 


Compilation 


This is the main process of translating your original C program into a 
machine language that the computer can directly execute. Before the 
compilation process starts, all the pre-processor commands are executed. 
These instructions generally modify your source file in various ways - 
usually by adding the contents of a file, modifying statements within the 
program, or by selectively including or excluding portions of your source 
file depending on initial conditions. We'll look into how pre-processor 
commands are used in Chapter 9. 


Error Generation 


The compiler will detect several different kinds of errors during the 
translation process, and most of these will prevent the machine language 
module, usually called an object module or object file, from being generated. 
Various messages are generated by the compiler to tell you what sort of 
error has been detected. For example, the TC++ for Windows environment 
will show you something like this: 


Compiling CABAD.C:
Warning CABAD.C 12: Call to function 'Response' with no prototype
Error CABAD.C 13: Undefined symbol 'retun'
Error CABAD.C 13: Statement missing ;
Warning CABAD.C 14: Function should return a value
Error CABAD.C 17: Type mismatch in redeclaration of 'Response'
Error CABAD.C 18: Declaration terminated incorrectly
Error CABAD.C 26: Unterminated string or character constant
Error CABAD.C 27: Function call missing )
Error CABAD.C 27: Unterminated string or character constant
Warning CABAD.C 30: Parameter 'Name' is never used
Cascading Errors


Initially a common cause of confusion is that just one error can result in a 
whole stream of invective from your compiler, referring to a large number 
of different errors. Here's an example of this, using the Borland command- 
line compiler to compile the Ex1-01.c program: 


Borland C++ 4.5 Copyright (c) 1987, 1994 Borland International
ex1-01.c:
Error ex1-01.c 25: ( expected in function Response
Error ex1-01.c 29: Declaration syntax error in function Response
Error ex1-01.c 29: Declaration missing ; in function Response
Warning ex1-01.c 29: 'pMsg' is assigned a value that is never used in function Response
Warning ex1-01.c 29: Parameter 'Name' is never used in function Response
Error ex1-01.c 31: ) expected
Error ex1-01.c 33: Declaration terminated incorrectly
Error ex1-01.c 34: Unexpected }
*** 6 errors in Compile ***


This erroneous program has the opening brace missing, but everything else 
is okay. This is detected by the compiler and noted in the first error 
message, but then we get five other error messages and two warnings, all 
derived from the same error. This is often called a cascading error, or an 
error cascade. 


This isn't a deficiency in the compiler. Once a statement breaks the rules,
the compiler can no longer be sure how to interpret what follows, so various
other things may appear to be wrong. An error in one statement can also
easily make another incorrect, as it may leave things undefined that
subsequent statements assume exist.


Error Revision 


The above error is very obvious, but it isn't always so. The usual approach
is to consider the messages carefully, fix those errors that you know you
can, and then have another go at compiling. Always correct errors in order,
starting with the first, so that cascading errors are hopefully all removed
in one step. All errors at this and later stages usually necessitate going
back to the editor and re-editing the source code. Of course, if only one
source file is in error, then you only need to go back, edit that file and
recompile it. When you finally succeed in compiling your program, you'll
have the object files ready to be input into the next phase, but this in
itself doesn't guarantee that your program is error-free: errors in the
logic of the program can't be picked up by the compiler.
Environment Help


Compiling in the development environment supplied with a compiler can 
provide a little additional help in finding errors. Running the same example 
through the Borland Turbo compiler development environment produces the 
output shown here: 


[Screenshot: the Borland Turbo C development environment, with EXAMPLE.C open in the edit window and the compiler's errors and warnings listed in the Message window beneath it.]


Here, the line of code where the error was recognized is highlighted, which 
in many cases is the line you need to fix. With an interactive development 
environment it becomes very easy to step through each of the errors in 
sequence, correcting them as you go, but you always need to remember the 
possibility of a cascading error. 


Linking 


Linking your program is the process of integrating everything into a single 
executable file, bringing in library functions where necessary. Under UNIX, 
this is performed by the loader; under MS-DOS, a proprietary linker is 
usually supplied with the compiler. 







A frequent cause of linking errors when you've just installed a compiler 
system is that your environment isn't correctly set up, or that you haven't 
specified where the libraries are to be found. If the linker (or loader) is 
unable to find the library functions it needs, then it can't create an
executable version of your program. The error messages that you're likely to 
encounter depend on the environment in which you are working, although 
they usually give a clear indication of what is missing. You'll need to turn 
to your documentation to make sure that all the setup options are in place 
and correct. 


Execution 


Once an executable file has been produced (it should have a .EXE 
extension), it can be run. This is usually achieved by selecting the Run 
option from the relevant menu in your editing environment. If you don't 
have an interactive editing system, then you can simply execute it as you 
would any other application - directly from the command-line prompt or 
program selection utility in a Windows environment. 


Operating System Effects 


The environment of your operating system may affect how you implement 
your programs. The UNIX environment, for example, provides mechanisms 
where one program can initiate another, and where one program can 
communicate with another. While this may affect the overall approach you 
might take to implementing an application, it won't affect the C language 
implementation, and therefore won't affect the programming techniques that 
you'll use. 


UNIX also differs from DOS in the way that it calls a program for
execution: more information is passed to the function main() than in DOS.
We'll look at this in Chapter 5. There is also a whole range of system
functions for communicating with the operating system, and these vary
considerably between environments.


Throughout this book we'll be sticking to ANSI-standard C and trying to
avoid getting embroiled in the system-specific aspects of all the particular
environments where you may be using C.
Summary


With ANSI-standard implementations of C now generally available, C is an 
excellent choice of programming language for just about any environment. 
Of course, for UNIX workstations and IBM compatible PCs, it's still the 
most widely used computer programming language available. It has the 
merit of combining power with simplicity, and ease of use with ubiquity. If 
you've used any other general purpose programming language, then you 
should find C easy to learn and much more effective, and if you haven't 
written a program before, then your choice of a language to start out with 
is excellent. 


In this chapter we have seen that: 


A C program is defined in terms of statements. These may include 
keywords, which are reserved words that can't be used for any other 
purpose. 


Statements may include whitespace, which consists of blanks, tabs, 
comments or newline characters. Whitespace is used to space out 
statements for readability and is ignored by the compiler (except 
within a character string constant). 


A C program consists of one or more functions. A function is a self- 
contained block of code, which performs a particular calculation and 
is executed by stating its name. There's always a function called 
main() in any C program, and execution starts at the beginning of 
it. 


The source code for a C program is contained in one or more source 
files. A program can be spread across as many source files as you 
find convenient. 


C programs use functions provided by standard libraries. To use the 
functions in a standard library, you must include the appropriate 
header file in your source file. The header file, also referred to as an 
include file, provides standard definitions relating to library functions 
necessary for the compiler to process your program correctly. 


The translation of a C program into a form which can be executed
involves two steps: compiling, which produces a set of machine code 
files for the program from the source files and linking, which adds 
functions from libraries and assembles the object modules into a 
single executable program module. 


Now that we have a feel for the generalities, it’s time to get down to 
specifics. We start programming in C in the next chapter. 
Chapter


Variables, Data Types, and Computation





You're now going to learn about the fundamentals of computing in C - how 
to read data into a program, how to calculate things with it, and how to 
show some results. By the end of this chapter you will understand: 


What variables are in C. 


What kinds of data you can handle in C. 


How to define and name variables, and how to specify constants of 
various kinds. 


How to perform arithmetic calculations and what operators we can 
use. 


What bitwise operations are, and how they work.


The rules governing the sequence of calculations in C.


How operations between values of different types are carried out, 
and how a value of one type can be converted into another. 


What is meant by variable scope. 
Variables


A variable is a named piece of memory in your computer, where you can
store some piece of data. It's called a variable because the data can be 
changed as the program progresses. Variables can store various kinds of 
data, but each individual variable can only hold one particular type of data, 
specified when you first define it. 


Let's now look at the rules for naming C variables and some of the 
different types of data that we can store. 


Naming Conventions 


The name we give to a variable is called an identifier, or rather more 
conveniently a variable name. It's a very flexible system whereby identifiers 
consist of a string of letters, digits and the underscore character, but must 
begin with a letter. Some examples of valid variable names are: 


Cost debit pShape value MAXIMUM DimeStore 


A variable name cannot include any other characters and mustn't start with
a digit, so 8_Ball, 2big, and 6_pack aren't allowed. Neither is Hash! or
Jim-Bob. This last example (using a dash or a hyphen) is a very common
mistake; Jim_Bob would be quite acceptable though. Of course, Jim Bob
wouldn't be allowed because whitespace characters aren't valid characters in
variable names. Note that the variable names democrat and Democrat are
not names for the same variable, since upper and lower case letters are
treated as distinct characters.


Since an underscore character counts as a letter, you can define variables
with names starting with an underscore character such as _This and _That,
or even __Those (with two underscores). This is best avoided however,
since there are pre-defined names within the standard libraries that also
take these forms, so you could quite conceivably clash with them,
accidentally causing serious problems.


Apart from variables, there are quite a few other things in C that have
names, and they can all have identifiers of up to 31 characters in length.
Their names have the same definition constraints as variable names, so the
name of anything in C is governed by this one set of rules.


ANSI Identifiers 


In ANSI standard C, at least 31 characters must be significant for identifiers.
This means that all ANSI standard C compilers will process and
differentiate variable names with up to 31 characters. Some compilers will
support identifiers with names over 31 characters, but ignore excess
characters, and others will even differentiate names with more than 31
characters regardless. For most purposes 31 characters in a name is more
than adequate, and if you regard this as a hard limit then you can be sure
that your variable names will be acceptable in any compiler that is
ANSI-standard compliant.


Hungarian Notation 


You can call your variables by whatever names you like within the 
definition rules, but a systematic approach to naming them can help you to 
avoid some common errors. 


One approach, which was used in the code for Microsoft Windows, is
Hungarian Notation. This adds a prefix of one or more characters to each
variable name, providing an indication of what kind of data the variable
contains. A few of the more common prefixes you may come across in C
are shown here:


c    char        p    pointer
i    int         s    string
l    long        w    word (unsigned int)


Sometimes the prefixes can be several characters deep, where more 
complicated entities are used, which can make the names quite long. 


We won't be going the whole hog in the examples in this book, but we will
be using the prefix p for pointer names because, as you will see later,
pointers have a particular potential for misuse.
Data Types


The kind of data your variable is going to store is obviously very 
important. Recording the number of persons in a string quartet requires a 
rather different kind of capacity to that required to hold the current gross 
national product. There are three basic kinds of data values that you can 
store in C variables: whole numbers (usually referred to as integers), 
characters, and floating point numbers. 


Integers 


The basic integer type in C is specified by the keyword int. You specify a
variable of this type in a declaration statement, for example:


int my_number;


This statement declares a variable called my_number, that is of type int, and
therefore can only be used to store whole numbers. All variables must be
declared before you use them. A variable of type int will normally occupy
2 bytes of memory, although it can be more on some machines.


long 


Since a 2-byte integer contains 16 bits, it can store values between -32,768 
and 32,767 (assuming that we're using a machine featuring 2's complement 
representation for negative numbers - the majority do, and we will continue 
to assume so throughout this book). 






Most computers represent such numbers using the two's 
complement notation, where the left-most bit denotes the sign 
bit. If this bit is 1 then the number is negative, and if the 
bit is 0 then the number is positive. This method of 
representation makes it a lot easier for the CPU to perform 
arithmetic operations. 










In a machine that doesn’t use 2’s complement arithmetic, the difference is 
slight - the range of values being between -32,767 and 32,767. In either case, 
the range isn't enough for many purposes, so the type long int is also 
available, which can be and usually is abbreviated to long. You can declare 
a variable of type long in a similar way to that of type int, but using the 
long keyword instead: 
A variable of type long occupies at least 4 bytes, and so it will be able to
store values in the range of -2,147,483,648 to 2,147,483,647. This gives us a
little more breathing space, but it's still not enough for the GNP of the USA
(we'll discuss how we can do that a little bit later).


short 


A third variation on integer variable types is short int, usually 
abbreviated to short. This is normally the same as int, except when int is 
the same as long, in which case short will be shorter, if you see what I 
mean. The essential idea is that a short variable should be smaller than a 
long. An example of a short variable declaration is: 





The range of values that you can store in a short variable is implementation
dependent. Commonly, a short variable will occupy 2 bytes, which would
allow for values from -32,768 to 32,767. ANSI C guarantees that a short can
hold at least the values from -32,767 to 32,767, so a compiler that gives you
only 1 byte (a miserly range of just -128 to 127) isn't ANSI-standard
compliant.


An Example of Using Integer Variables 


Let's try a simple example of using integer variables: 
Program Analysis


Apart from the explanatory initial comment, the example starts with a pre- 
processor directive, causing the library header file, STDIO.H, to be included 
in the program. This include file contains all the necessary definitions for 
our program to display output to the screen, and receive input from the 
keyboard. 


The rest of the program file consists of the single function main(). Note 
how each statement in the body of the function is terminated by a semi- 
colon. You will find it very easy to forget semi-colons when you first use C, 
but you'll soon get the hang of it. 


The function main() starts off by declaring three int variables, Apples, 
Oranges, and TotalFruit in separate statements. Next are four assignment 
statements, so called because they 'assign' a value to a variable. The first 
two of these assign integer values to the variables Apples and Oranges. 
The next takes the current value stored in Apples, adds 5 to it, and stores 
the result back in the variable Apples. The last assignment adds the values 
in the variables Apples and Oranges together, and stores the result in the 
variable TotalFruit. 


After a comment line the function printf() is called, which allows you to 
print text and variables to the screen. 


If you compile and execute the previous program example, it will produce 
the output: 


Total number of fruit = 40 


We will take a look at how we actually get this output, and how the 
printf() function works, a little later on. 


Integer Constants 


We used the integer constants 10, 25, and 5 in the previous example. If a
number is written without a decimal point (e.g. 92) then it's taken to be an
int; if it has an 'L' appended to it (e.g. 92L) then it's a long. You could
use a lower case letter l, but it's easily confused with the digit 'one', so it's
better to consistently use the upper case version.


Note that the commas you would normally use to write a number such as
99,999 aren't used in writing C constants, and will cause an error. You
would write this number as 99999L.


An integer written with a leading zero, 0123 for example, is understood to 
be an octal constant in C. Thus it has the decimal value 83. Octal is rarely 
used now - modern computers invariably have a word length which is a 
multiple of 8 bits, although some special purpose microprocessors still use 
octal representation. 


Integer constants can also be specified in hexadecimal form, that is base 16.
A hexadecimal number can have digit values from zero to fifteen, written
using the standard representation of 0 to 9, and A to F (or a to f). A
hexadecimal constant is also preceded by 0x (or 0X) to distinguish it from a
decimal value. For example, the decimal value 123 could be represented as
the hexadecimal constant 0x7B. For a crash course in hexadecimal number
representation, see Appendix B.


Declaring Several Variables 


All the examples of variable declarations so far have named just one
variable. You can also declare several variables of the same type in
a single statement. For example:


long Value1, Value2, BigNumber;





Some programmers prefer to declare each variable in a separate statement 
because it can be a little clearer, particularly when comments are needed to 
document what they are for. However, there is no hard and fast rule on 
this - it’s purely a matter of taste. 


Initializing Variables 


You can assign an initial value to a variable when you actually define it. 
For example, to declare a long variable, Distance, and initialize it with the 
value 93,000,000, you would write: 


long Distance = 93000000L;  /* Distance from the earth to the sun */





Where multiple variables are declared and initialized, they're separated by 
commas in the normal manner: 


int Quartet = 4, Octet = 8;
This type of declaration that also initializes a variable is called a defining
declaration, because it causes memory to be allocated for that variable, and 
its value to be defined. Variables that aren't initialized will contain a 
'garbage value'. 






When you switch on your computer, everything that you need 
to reside in memory is automatically loaded. Those areas of 
memory that aren't used will contain values that no one can 
predict - garbage values. Such values are wildly unpredictable. 


It’s usually rather unhelpful to have garbage floating around in your 
program, so it's good practice to initialize variables every time. This avoids 
leaving spurious values around, and makes it easier to discover what is 
wrong if your program doesn't work. 


Character Variables 


Variables that can hold a character are specified using the keyword char. 
The char data type serves a dual purpose; it can specify a one byte integer 
variable or a variable storing a single character. On most, but not all 
computers, this will be an ASCII character. 





ASCII is the acronym for the American Standard Code for
Information Interchange. Pronounced 'asskey', this 7-bit
standard code was adopted to facilitate the interchange of data
between different types of data processing equipment.







We will assume that we're programming a machine supporting the ASCII 
character set throughout this book, but we'll also address the implications of 
non-ASCII environments in Chapter 10. The ASCII character set appears in 
all its glory in Appendix B. 


Declaring Character Variables 


We can define a character variable with the statement: 





Note here that we specify a constant which is signified by single quotes, not 
double quotes. 
Since a value of type char occupies 1 byte, it can store integer values from
-128 to 127. A char variable can be treated as a character or an integer
value interchangeably. Because the character 'A' is represented as the
decimal ASCII value 65, we could have written:





to produce the same result. 


We can also use hexadecimal constants to initialize char variables (as with
other integer types). Thus we could rewrite the last statement as:





A character is always two hexadecimal digits, because two hexadecimal 
digits define 8 bits. 


An Example of Character Variables 


Here's an example of how we can use character variables: 





Program Analysis 


Here we have defined three char variables, initializing each in a different 
way. As you can see, we are able to use expressions as well as constants to 
initialize variables when they are being declared, as long as the expression 
evaluates to a constant. 


The printf() function (which we will be discussing shortly) will display
the specified text in the format string, followed by the values of the three
variables.
The values generated here can only be guaranteed on ASCII-based systems.
If you want your program to be portable between different kinds of 
computers, then you need to avoid this built-in ASCII dependency. 


Integer Type Modifiers 


Variables of the integral types char, int, and long can contain signed 
values by default, so they can store both positive and negative values. This 
is because the type modifier signed is assumed for these types by default. 
So wherever we wrote char, int, or long, we could have used signed
char, signed int, or signed long respectively. If you're sure that you
don't need to store negative values in a variable, then you can specify a 
variable as unsigned, where the sign bit is used as part of the data value 
allowing a larger maximum value to be stored. For example: 





In this case the minimum value that can be stored is zero, and the 
maximum is increased to 4,294,967,295. You can also apply the unsigned 
modifier to int as well, where such variables may assume values from 0 to 
65,535. Note how a U (or a u) is appended to unsigned constant values.


Both signed and unsigned are keywords in C, so you can't 
use them as variable names. 











In the previous example we have 'L' appended as well to indicate that the 
value is also long. You can use either upper or lower case for U and L and 
the sequence is unimportant too, but it's a good idea to adopt a consistent 
way of specifying such constants. 


Floating Point Variables 


Values which aren't integral are stored as floating point numbers. A floating 
point number has two parts, a decimal fractional part with a fixed number 
of digits called the mantissa, and an exponent which is the power of 10 by 
which the mantissa is multiplied. For example, the number 123.45 can be
written as 0.12345x10^3, so the mantissa here is 0.12345 and the exponent value
is 3. The number of digits in the mantissa, and the range of possible values
for the exponent are dependent upon the capacity of the computer system 
that you're using. 







You can write a floating point number as just a decimal value such as 112.5,
or with an exponent, such as 1.125E2, where the decimal part is multiplied
by the power of 10 specified after the E (for Exponent). Here 1.125E2 means
1.125 multiplied by 10 to the power 2, which is simply 112.5. Note that a
floating point constant must contain a decimal point, an exponent or both. If
you write neither then it's an integer.


double Variables 


You can specify a floating point variable using the double keyword, as in 
the statement: 





A double variable typically occupies 8 bytes in memory and stores values 
accurate to 15 decimal digits, so we still have room for a much more 
precise value for the speed of light. 


The range of values stored is much wider than that indicated by the 15
digits accuracy, because it's also determined by the range of possible
exponent values. The precise range is dependent on the kind of hardware
you are using, but for an IBM compatible PC it's from 1.7x10^-308 to
1.7x10^308, positive and negative. We're now able to deal with the GNP with
room to spare. The number 10^100 is called a googol, so here we're fully
googol enabled. Unfortunately, the googolplex is out though, since it's 10 to
the power of a googol, or 10^googol.


float Variables 


If you don't require 15-digit precision, and you don't need the massive 
range of values provided by double variables, then you can opt to use the 
float keyword to declare floating point variables occupying 4 bytes (on a 
PC). For example: 





This isn't, as you might have imagined, the volume of a takeaway beer, but 
the conversion factor from a US pint to the Japanese ‘go’ unit of measure. 
In case you ever need it, there are 10 'shakus' to the 'go'. The f at the end
of the constant specifies it to be of a float type. Without the f, the
constant would have been represented as a double. Note that if you want
to write a float value with an exponent, then the suffix must go at the very 
end, as in: 
Variables declared as float are of 7-decimal-digit precision on an IBM
compatible PC, and can have values from 3.4x10^-38 to 3.4x10^38, both positive
and negative.




long double Variables 


Another floating point type that can provide even more precision on some
computers is a long double. In some implementations (Borland C++ 4.0
for instance), variables of this type provide 19-digit precision, and support
numbers in the range of 3.4x10^-4932 to 1.1x10^4932.


You define constants of this type with the suffix 'L' and, as with all
suffixes, it can be in lower case as well. An example of defining a variable
of this type is:





This defines a value for the ratio of the circumference of a circle to its 
diameter, with 19-digit precision. 


Named Constants 


Circumstances arise quite frequently where you want some
variables to have a fixed value that shouldn't be changed during the
execution of a program. The last example is a case in point. Having defined 
the value for Pi you'll probably never want to change it. Prefixing the 
const keyword does the trick, for example: 





You can use the const qualifier in the declaration of any variable that you 
don't want to be changed. Naturally any const variable must have an 
initial value assigned to it, because it's impossible to assign anything to it 
later. 
Boolean Types


C doesn't provide a standard boolean (or logical) type, so you must find 
another way of representing them. You could use integers to represent them 
(i.e. 1 and 0), which are faster than characters, but characters could save 
you some data space. For this reason, the designers of C decided that this 
space/time trade-off should be left up to the programmer. 


Rather than scattering bare 1s and 0s through your code, you could use any of
the following:


#define TRUE 1
#define FALSE 0

#define YES 1
#define NO 0

enum boolean {false, true};

enum boolean {no, yes};


We will be discussing True and False in greater detail in the next chapter. 


Enumeration 


You will sometimes be faced with the need for variables that have a 
limited set of possible integer values, for instance the days of the week or 
the months in the year. We have a specific facility in C to handle this 
situation called an enumeration. Let’s take the example of a variable that 
can assume values corresponding to days of the week. We can define this 
as: 


enum Week {Mon, Tues, Wed, Thurs, Fri, Sat, Sun} This_week;


This declares an enumeration type called Week, and the variable
This_week that is an instance of Week, so that it can only assume the
values between the braces.













Note that if you try to assign to an enumeration variable such 
as This_week anything other than the set of values specified,
then your compiler won't necessarily flag this as an error. 
Enumeration Constants


The symbolic names listed between the braces are known as
enumeration constants. In fact, the names of the days will be defined as
having fixed integer values. The first name in the list, Mon, will have the
value 0, Tues will be 1, and so on. If you would prefer the implicit
numbering to start at a different value then you can just write:





and they will be equivalent to 1 through to 7. Having defined the form of 
an enumeration, you can define another variable thus: 





This defines a variable Next_week as an enumeration that can assume the
values previously specified. 


Assigning Specific Values 


You can also assign specific values if you wish. We could define this 
enumeration for example: 





Here we've defined the possible values for enumeration variables of type 
Punctuation as the numerical equivalents of the appropriate symbols. If 
you look in the ASCII table in Appendix B, you'll be able to see that in 
decimal they are 44, 33, and 63 respectively. 


Obviously the values assigned don't have to be in ascending order. If you 
don't specify all the values explicitly, then incrementing values continue to 
be assigned from the last specified value, as in our second Week example. 


Defining Boolean Variables 


You could also use an enumeration to define the idea of logical variables.
Take the following for example:


enum Boolean {False, True} B1, B2, B3;


This defines three variables as having the ability to possess the values of 
False or True. 


Printing Text and Variables 


The simplest way of printing data to the screen is to use the printf()
function, from the STDIO.H library. We have already seen a brief example of
this, and it worked like this:


printf("\nTotal number of fruit = %d", TotalFruit);


Function name: printf
Argument list: everything between the parentheses
Format string: "\nTotal number of fruit = %d"
Output format specified for variable: %d
Variable to be displayed: TotalFruit


The argument list specifies what is passed to the function, and the
arguments are separated by commas. In this example there are two
arguments, a format string which specifies how the output is to be
presented, and the variable TotalFruit which contains the value that we
want to display.


The format string is a sequence of characters enclosed within double quotes. 
It can contain two sorts of information: text to be displayed, and format 
specifiers which determine how the values of variables which appear in the 
argument list are to be presented. A format specifier always begins with a 
% symbol. In our example, the format specifier is %d which indicates that
the value to be displayed is a decimal integer of type int. To output a
value of type long you would need to use the format specifier %ld. The l
specifier is called a length modifier. For a short variable, the length modifier
is h so you would use the format specifier %hd.







Note that the printf() function deduces the type of data that 
you are passing to it from the format specifier. If the format 
specifier doesn't match the variable type, then you won't get 
the correct value displayed. 







The format specifiers are matched in sequence with the variables to be 
displayed, so there should be the same number of format specifiers in the 
format string, as there are variables in the argument list. Of course, the 
format specifiers always need to be appropriate to the type of value to be 
displayed. 


Note that if you ever want to display the percent symbol, then you must 
use %% in the format string. 


Escape Sequences 


The \n at the beginning of the format string is called an escape sequence. 
This is because the \ character escapes from the standard interpretation of 
the string to interpret the following character in a special way. Here the \n 
pair represents a newline character. 


There are several escape sequences you can use. Some of the particularly 
useful ones are: 


\a sound a beep          \b backspace 
\n newline               \t tab 
\' single quote          \" double quote 
\\ backslash 


Obviously, if you want to be able to include a backslash or a double quote 
as a character to be output, then you must use the escape sequences to 
represent them. Otherwise the backslash would be interpreted as another 
escape sequence, and a double quote would indicate the end of the 
character string. 













Arithmetic Operations 


All of the computational aspects of C are fairly intuitive, so we should slide 
through this like a hot knife through butter. The C arithmetic operators are: 


+ Addition 

- Subtraction 

* Multiplication 

/ Division 

% Remainder or Modulus (integers only) 


++ Increment (integers only) 


-- Decrement (integers only) 


Arithmetic Expressions 


The first four arithmetic operators in this table are similar to what you’re 
used to using in normal arithmetic. For example, the expression: 


2*2.5 + 8/2 


evaluates to 9. The multiply and divide operations are executed before the 
addition, just like normal arithmetic, because they are said to be of a higher 
precedence than addition or subtraction. We will come back to the question 
of operator precedence before the end of this chapter. 


The Remainder Operator 


The remainder operator only works with integers. It calculates the remainder 
when the left integer operand is divided by the right integer operand, so if 
A and B are integer variables with the values 10 and 3 respectively, the 
expression: 


A%B 


has the value 1. The remainder operator has the same precedence as 
multiply and divide. Where more than one of these occur in the same 
expression, they are evaluated from left to right; in other words, the 
multiply, divide, and remainder operators have left to right associativity. 


The Divide Operator 


When the divide operator is used with integer values, the result is rounded 
down to an integer. With variables A and B having the values 10 and 3 
respectively, the following expression produces the value 3: 


A/B 


The Increment and Decrement Operators 


These operators don't apply to floating-point variables. The increment 
operator increases the value of a variable by 1, and the decrement operator 
decreases a value by 1. So if A starts out as 10, after executing the 
expression ++A, it will have the value 11. Within an expression, their effect 
is slightly unusual. The operators can be placed before or after the value to 
which they apply, that is as either a prefix operator or a postfix operator. If 
the operator prefixes a variable, as in the expression: 


2*( ++A ) 


then assuming that A starts out as 10, it will first be incremented to 11, so 
that the value of the expression is 22. If, on the other hand, the postfix 
form is used, as in the expression: 


2*( A++ ) 


then the value of A will be incremented after the value of the expression 
has been calculated. So if, as in the previous example, A starts out as 10, it 
will still end up as 11, but the value of the expression will be 20. If you 
omit the parentheses and just use the expression 2*A++, then the value of 
the expression will still be 20, and A will be incremented to 11 after the 
value of the expression has been calculated. 







Using Parentheses 


You can always use parentheses in an arithmetic expression to ensure that 
the calculation proceeds in the order that you want it to. In any expression 
containing parentheses, all the sub-expressions are evaluated starting with 
the innermost and working to the outermost. This is easy to see with an 
example: 


( 2*( A + B ) - C )*( D - E ) 


The value of ( A + B ) will be calculated first. The result will then be used 
to evaluate the sub-expression ( 2*( A + B ) - C ). Next, the value of 
( D - E ) will be calculated and finally be multiplied by the previous result 
to generate the final value for the whole expression. 


The Assignment Statement 


An assignment statement assigns the value of the expression to the right of 
the equals sign, to the variable appearing to the left of the equals sign. A 
typical arithmetic assignment statement would look like: 





whole = part1 + part2 + part3; 


In this statement, the whole is exactly the sum of its parts, and no more. 
However, recalling the odd behavior of the increment and decrement 
operators, we could write: 


whole = (part1--) + (part2--) + (part3--); 


After the execution of this statement the variable whole will be three more 
than the sum of the variables part1, part2 and part3, since each of these 
will be decremented after the expression on the right of the equals sign 
has been evaluated. 


Multiple Assignments 


You can also write repeated assignments such as: 


A = B = 1; 


This is equivalent to assigning the value 1 to B, then assigning the value of 
B to A. 


An Arithmetic Exercise 


We can exercise basic arithmetic in C, along with a few of the other things 
we have covered so far in this chapter, by calculating how many standard 
rolls of wallpaper are needed to paper a room. This is done with the 
following example: 










Program Analysis 


One thing needs to be clear from the outset. No responsibility is assumed 
for you running out of wallpaper as a result of using this program. All 
errors in the estimate of the number of rolls required are due to the way C 
works, as we shall soon see, and due to the wastage that inevitably occurs 
when you hang your own wallpaper - usually 50%. 


We have a block of declarations for the variables used in the program right 
at the beginning of the body of main(). These statements should be fairly 
familiar by now. Two of them define constant variables: 





Because they have been declared as constants, the compiler can check that 
they are used properly, and in particular, it will complain about any 
attempts to change their values. 


Note that the variable names declared as const are written here with 
capital letters. This is a common convention to distinguish them from 
variables. It can be very useful defining constants by means of const 
variable types, particularly when you use the same constant many times. 


Constant Expressions 


The const variable ROLLLENGTH is also initialized with an arithmetic 
expression (12.*33.). Being able to use a constant expression as an 
initializer saves having to work out the value yourself, and can also be a 
lot more meaningful, since 33 feet multiplied by 12 inches is much clearer 
than simply writing 396. The compiler will generally evaluate constant 
expressions accurately, whereas if you do it yourself, depending on the 
complexity of the expression and your ability to number crunch, there's a 
possibility that you may be wrong. 


You can use any expression that can be calculated as a constant at compile 
time, including const objects you've already defined. So for instance we 
could declare the area of a standard roll of wallpaper as: 





This statement would obviously need to be placed after the declarations for 
the two const variables used in the initialization of ROLLAREA. 


Reading Floating Point Values 


The next four statements in the program handle the user input: 





We've used printf() to display prompts, and then we use the function 
scanf() to input the height, length and width. In a practical program we 
would need to check for errors, and possibly make sure that the values 
entered are sensible, but we will look at that a little later. 


Note the format specification, %lf, for reading a variable of type double. 
Here we don't have the option of using a capital 'L' in the format specifier. 
The specification %Lf is for reading values of type long double. To read a 
value of type float you would just use plain old %f. 


Notice that the names of the variables in the argument list to scanf() are 
prefixed by an ampersand. Exactly why this is we'll discover later, but for 
the meantime, just make sure that you use an '&' with variable names in 
scanf(). 


Calculating the Result 


We have four statements involved in calculating the number of standard 
rolls of wallpaper required for the size of room given. First we calculate the 
number of room-height strips we can get out of a single roll: 





Note that the result is stored as an integer, which will mean that the result 
is rounded down, effectively discarding partial strips, which is actually what 
we want. For example, if the room is 8 feet (96 inches) high, we divide 96 
into 396, giving us a result of 4.125 - four strips with a little left over. 


The perimeter of the room is equal to twice the sum of the length and 
breadth, parentheses being used to make sure that the addition is carried 
out before the multiplication: 





The last arithmetic statement calculates the number of rolls required, by 
dividing the number of strips required (integer), by the number of strips in 
a roll (also integer): 








Because we are dividing one integer by another, the result has to be an 
integer and any remainder is ignored. The result we obtain is essentially the 
same as if we produced a floating point result and rounded down to the 
nearest integer. This isn't really what we want, so you'll need to fix this 
too, if you want to use this program in practice. 


As a rule you should only use the floating point types when 
you need to. If you have to use floating point types to hold 
integer values, you must not rely on their values being exact. 
In floating point, 0.9999999 is as good as 1.0 and in most 
instances it will make no difference. However, if you round 
down to an integer, it isn't one at all, it’s zero. 


























The op= Version of Assignment 


It's often necessary to modify the existing value of a variable, such as 
incrementing or doubling it. We could add five to a variable count using the 
statement: 


count = count + 5; 


This simply adds five to the current value stored in count, and stores the 
result back in count, so if count started out at 10, then it would end up as 
15. In addition, you also have an alternative shorthand method of writing 
the same thing in C: 


count += 5; 


We can also use other operators with this same notation as well. For 
example: 


count *= 5; 


has the effect of multiplying the current value of count by 5, and storing 
the result back in count. 


Operators and Syntax 


In general we can write statements of the form (where lhs and rhs stand 
for left-hand and right-hand side respectively): 


lhs op= rhs; 


where op is any one of the following operators: 


+ - * / % & ^ | << >> 


The general form of the statement is equivalent to: 


lhs = lhs op ( rhs ); 


This means that we can write statements such as: 


a /= b + c; 


which will in effect be identical to: 


a = a/( b + c ); 


Bitwise Operations 


The bitwise operators treat their operands as a series of individual bits 
rather than a numerical value. They only work with integer variables or 
constants, so only the data types short, int, long, and char can be used. 
They are particularly useful in programming hardware devices, where the 
status of a device is often represented as a series of individual flags or bits 
in a word. 


There are six bitwise operators: 


&  bitwise AND 
|  bitwise OR 
^  bitwise exclusive OR 
~  bitwise NOT 
>> shift right 
<< shift left 


We won't be discussing the shift operators here since they are rarely used, 
but let's take a look instead at how each of the first four work. 


The Bitwise AND 


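The bitwise AND, &, is a binary operator that combines corresponding bits 
in its operands. The result bit is 1 only if both operand bits are 1. We can 
represent this in a table, often referred to as a truth table: 


Bit 1   Bit 2   Bit 1 & Bit 2 
  0       0           0 
  0       1           0 
  1       0           0 
  1       1           1 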





Let's see how this works in an example. 


A Simple Example 


Bitwise operators are commonly used to test the properties (or attributes) of 
files. For example, if a certain file had a bit field in the format '00010010' 
representing its attributes, we could perform a variety of operations on it. 
Suppose that we wanted to check whether this file is writable, where the 
writable attribute is represented in bit 2. 


If nResult becomes 0 then the specified file isn't writable. You can confirm 
this by looking at how corresponding bits combine with & in the truth table. 


Masking 


Because the & produces zero if either bit is zero, we can use this operator 
to make sure any unwanted bits are zero. We can achieve this by creating 
what is called a mask, which is combined with the original variable using 
&. We create the mask by putting 1’s where we want to keep bits, and 0's 
where we don't. The result will be 0's where the mask bit is 0, and the 
same value as the original bit in the variable where the mask is 1. 


Suppose that we have a char variable Letter where, for the purposes of 
illustration, we want to eliminate the 4 high order bits, but keep the 4 
low order bits. This can easily be achieved by setting up a mask as 0x0F 
(00001111 in binary), and combining them using &, like this: 


Letter = Letter & 0x0F; 


If Letter started out as 0x41, then it would end up as 0x01 as a result. 


The Bitwise OR 


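The bitwise OR, |, is sometimes called the inclusive OR. The result bit is 1 
if either or both of the operand bits are 1. The truth table for the bitwise 
OR is: 


Bit 1   Bit 2   Bit 1 | Bit 2 
  0       0           0 
  0       1           1 
  1       0           1 
  1       1           1 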





The OR can be used to turn bits on. If we want to be sure that the fifth bit 
(counting from the left) in a char variable flag is on, but we want to leave 
the other bits alone, then we could use the statement: 


flag = flag | 0x08; 


The value ORed with flag is 0000 1000 in binary, which forces the fifth bit 
to be 1, and leaves the others as they were. 


The Bitwise Exclusive OR 


The exclusive OR (abbreviated to EOR or XOR), ^, is so called because it 
operates in a similar way to the inclusive OR but produces 0 when both 
operand bits are 1. Its truth table is therefore: 


Bit 1   Bit 2   Bit 1 ^ Bit 2 
  0       0           0 
  0       1           1 
  1       0           1 
  1       1           0 





The ^ operator has a rather surprising property. Suppose that we have 
two char variables, First with the value 'A', and Last with the value 
'Z', corresponding to the binary values 0100 0001 and 0101 1010. If we 
write the statements: 


First ^= Last;      /* First is now 0001 1011 */ 
Last ^= First;      /* Last is now 0100 0001 */ 
First ^= Last;      /* First is now 0101 1010 */ 






then the results show that First and Last have exchanged values without 
using any intermediate memory location. This also works with any integer 
values. 


The Bitwise NOT 


The bitwise NOT, ~, takes a single operand for which it inverts the bits. 
Thus, take the following statement: 


Result = ~Letter1; 


If Letter1 is 0100 0001, then Result will have the value 1011 1110, which 
is 0xBE or 190 as a decimal value. 


Variable Types and Casting 


In spite of the impression you may have gained so far, calculations in C, or 
any other programming language for that matter, can only be carried out 
between values of the same type. Your computer can't work any other way. 


To get around this, when you write an expression involving variables or 
constants of different types, the compiler converts one of the operands to 
match the type of the other. This conversion process is called casting. For 
example, if you add a double value to an integer, then the integer value is 
first converted to double, and then the addition is carried out. Of course, 
the variable which contains the value which must be cast isn't changed. The 
compiler will store the converted value in a temporary memory location 
which will be discarded when the calculation is finished. 


Conversion Rules 


There are rules that govern which operand is selected to be converted in 
any operation. 


Any expression to be calculated can be broken down into a series of 
operations between two operands. For example, the expression 2*3-4+5 
amounts to the series: 


2*3 resulting in 6 
6-4 resulting in 2 
and finally 2+5 resulting in 7 


Thus, the rules for casting operands where necessary need only be 
defined in terms of decisions about pairs of operands. So for any pair of 
operands, the following rules are checked in the sequence that they are 
written until one applies, and then that rule is used: 




A Simple Example 


We could try these rules on a hypothetical expression to see how they work. 
Let's suppose that we have a sequence of variable declarations as follows: 


Let's also suppose that we have the following statement: 





We can now work out what casts the compiler will apply. 


1 The first operation is to calculate (value - count). Rule 1 doesn't 
apply but Rule 2 does, so the value of count is converted to double 
and the double result, 15.0 is calculated. 


2 Next (count + num) must be evaluated, and here the first rule in 
sequence which applies is Rule 5, so num is converted from char to 
int and the result 12 produced as type int. 


3 The next calculation is the product of the first two results, a double 
15.0 and an int 12. Rule 2 applies here and the 12 is converted to 
12.0 as double, and the double result 180.0 is produced. 


4 This result now has to be divided by many, so Rule 2 applies again 
and the value of many is converted to double before generating the 
double result 90.0. 


5 The expression num/many is calculated next, and here Rule 3 applies 
to produce the float value 2.0f after converting the value of num 
from char to float. 


6 Lastly the double value 90.0 is added to the float value 2.0f for 
which Rule 2 applies, so after converting the 2.0f to 2.0 as double, 
the final result of 92.0 is stored in value. 


Casts in Assignment Statements 


As we saw in example EX2-03.C, you can cause an implicit cast by writing 
an expression on the right hand side of an assignment that is of a different 
type to the variable on the left hand side. This can cause values to be 
changed and information to be lost. For instance, if you assign a float or a 
double value to an int or a long variable, then the fractional part will be 
lost and just the integer part will be stored, assuming that it doesn't exceed 
the range of values available for the integer type concerned. 


For example, after executing the following code fragment: 







the value of number will be 2. Any constant containing a decimal point is 
floating point, and if you don't want it to be double precision then you 
need to append an f. A capital F would do just as well. We can also 
define long integer constants by appending an l, or better still an upper 
case L to avoid confusion with the digit 1, to the integer value. Thus, 99 is 
of type int and 99L is of type long; with a compiler that uses 2-byte int 
and 4-byte long values, these occupy 2 and 4 bytes respectively. 


Explicit Casting 


Sometimes though, the default cast rules can be inconvenient. Suppose you 
have an expression: 





x = x + i/j; 


where x is double and i and j are integers. Because of the way integer 
division works, you won't get an exact result here unless i is a multiple of 
j. The variable i will be divided by j and any fractional part in the result 
will be discarded. 


You can use an explicit type cast to convert a value from one type to 
another. We can rewrite the last statement as: 





The (double) in the right hand side expression causes i to be converted to 
a double. As a result, the value of j must also be converted to type 
double before the division occurs, so we now get an exact result. 


Syntax of Explicit Casting 
In general: 
( type )expression 


causes the value of expression to be converted to type, before the value is 
used further. 


Operator Precedence 


Operator precedence orders the operators in a priority sequence. Operators 
with the highest precedence are always executed before operators of a lower 
precedence. The precedence of the operators in C is shown in the following 
table: 








Precedence    Operators                                       Associativity 

1 (highest)   () function call  [] array indexing  ->  .      Left to right 
2             ! ~ ++ -- unary + - & * (typecast) sizeof       Right to left 
3             * / %                                           Left to right 
4             + -                                             Left to right 
5             << >>                                           Left to right 
6             < <= > >=                                       Left to right 
7             == !=                                           Left to right 
8             &                                               Left to right 
9             ^                                               Left to right 
10            |                                               Left to right 
11            &&                                              Left to right 
12            ||                                              Left to right 
13            ?: (conditional operator)                       Right to left 
14            = and the op= assignment operators              Right to left 
15 (lowest)   , (comma operator)                              Left to right 










Here, operators in the same row are all of equal precedence. If there aren't 
any parentheses in an expression, then operators of equal precedence are 
executed in a sequence determined by their associativity. Thus, if the 
associativity is 'left to right', then the leftmost operator in an expression is 
executed first, progressing through the expression to the rightmost. There 
are a few operators you haven't seen yet, but you'll know most of them by 
the end of this book. 


Note that where an operator has a unary (working with one operand) and a 
binary (working with two operands) form, the unary form is always of a 
higher precedence and is therefore executed first. The unary + and - apply 
to constants or expressions, as in -1.234, or -(A+B) for example. We will 
see the unary * in Chapter 4. 


Rather than spend hours with your family and friends doing memory tests 
to remember the precedence table, you can always override it using 
parentheses. Since there are so many operators in C, sometimes it can be 
hard to be sure what actually takes precedence over what. In such cases it's 
a good idea to insert parentheses just to make sure. A further plus is that 
parentheses often make the code easier to read. 


Variable Scope 


The region of a program where you can use a certain variable is called the 
variable's scope. All variables are limited in scope. They come into existence 
at the place where you define them, and at some point, at the latest when 
your program terminates, they disappear. Obviously they can only be used 
while they exist, and for some variables their scope is more limited than this. 


Automatic Variables 


All of the variables that we’ve declared thus far have been declared within 
a block, that is within the extent of a pair of curly braces. These are called 
automatic variables, and are said to have local or block scope. They are born 
when they are declared and they automatically cease to exist at the end of 
the block containing their declaration. You can declare variables at the 
beginning of any block, immediately after the opening brace. 


An Example to Demonstrate Variable Scope 


We can demonstrate how automatic variables behave with this example: 





Program Analysis 


The output from this example will be: 


Value of outer count1 = 10 
Value of inner count1 = 20 
Value of outer count1 = 10 
Value of outer count3 = 80 


Two variables are declared at the start of the main routine; then a new 
block is started, and two new local variables are declared, one of which 
hides the original count1. When that block ends, the local variables no 
longer exist, and you can see that changes made within the block haven't 
affected the value of the original count1. 





The output statement shows by the value in the second line that we're 
using the count1 in the inner block. The variable count1 is incremented, 
and the increment applies to the variable in the inner block since the outer 
one is still hidden. However, count3, which was defined in the outer block, 
is incremented without any problem, showing that the variables defined at 
the beginning of the outer block are accessible within the inner block. 


After the closing brace, count2 and the inner count1 cease to exist. The 
variables count1 and count3 are still there in the outer block, and the 
values displayed show that count3 was indeed incremented in the inner 
block. If you de-comment the line: 


/* printf( "\nValue of count2 = %d", count2 ); */ 


the program will no longer compile correctly because it attempts to output 
a non-existent variable. You should get some kind of error message 
indicating that count2 is undefined at this point. 


Global Variables 


Variables declared outside of all blocks and functions are called global 
variables or globals, and have file scope. This means that they're accessible 
throughout all the functions in the program file, after the point where they 
were declared. If you declare them at the very beginning, then they will be 
accessible throughout the file. 


Since global variables are declared outside of all the blocks in a program, 
they continue to exist as long as the program is running. This might raise 
the question in your mind, 'why not make all variables global and avoid all 
this messing about with local variables that disappear?'. This sounds like a 
very attractive proposition at first, but like the Sirens of mythology, global 
variables bring serious disadvantages with them that completely outweigh 
any advantages that you might gain. 


Real programs are generally composed of a large number of statements, a 
significant number of functions, and a great many variables. Declaring all 
variables of global scope greatly magnifies the possibility of accidentally 
modifying a variable, as well as making the job of sensibly naming them a 
little difficult. By keeping variables local to a function or a block, they have 
almost complete protection from external effects, and the whole development 
process becomes much easier to manage. 


There are also significant advantages in memory management when you use 
local variables. The memory occupied by a local variable is automatically 
made available for other purposes at the end of the block where the local 
variable is defined. This allows the same memory area to be used for many 
different purposes during the execution of a program. 


Static Variables 


It's quite conceivable that you might want to have a variable that's defined 
and accessible locally, but continues to exist after exiting the block in which 
it is declared. This will become more apparent when we come to deal with 
functions specifically in Chapter 5. The static specifier provides this. To 
declare a static variable count you would write: 


static int count; 


Although a variable declared as static will continue to exist for the 
duration of a program, if it's declared within a block, its scope will be 
limited to just that block. Static variables, which retain their values between 
visits to the block, are always initialized to zero if you don't provide an 
initializer yourself. The variable count declared here will be initialized 
with 0. 





Register Variables 


You can indicate that you want a variable to be placed in a register in your 
computer, rather than in conventional memory. Operations using registers 
are faster than using conventional memory. They are integer variables, and 
only 16 are allowed on a PC. They are defined using the register 
keyword. For example, the declaration: 


register int number; 





declares an integer variable number, and requests that it be placed in a 
register. However, the compiler reserves the right not to put your variable 
in a register if it doesn't have one available. 





This kind of declaration isn't used very frequently because the limited 
availability of registers in most contexts means that it doesn't usually 
produce a substantial improvement in the performance of a program. 


Defining Your Own Data Type Names 


If you don't like standard type names such as long or double, you can 
define alternative names for them using the typedef keyword. For example, 
if you wanted to use BigOne instead of long, you could define BigOne as an 
alternative, with the statement: 


typedef long BigOne; /* Define BigOne as equivalent to long */ 


You can now use BigOne just as you would have used long. For example, 
we can now declare a variable Number as type long with the statement: 


BigOne Number; 


Of course, you can still use the long keyword in your program too. 


This may seem quite trivial at this point, since we aren't defining a new 
type, just an alias for an existing type. We will see later that this can 
become quite an asset in two contexts; in providing a simple means of 
expressing a complex type, and as an aid to portability where the meaning 
of integer types can differ on various kinds of computer. 


Summary 


In this chapter you've learnt all the basic types of data that you can handle 
in C, and almost all of the operations that you can carry out on them. The 
only ones missing are those associated with comparing and testing values, 
and we will get to those in the next chapter. You should now feel 
comfortable with writing a program consisting of a function main(), and 
using all of the operators we have discussed so far. 


In the next chapter we will take a giant leap forward, since we will add 
decision making capability to the computational skills we have just gained. 


Programming Exercises 


1 Write a program to convert a Fahrenheit temperature value read from 
the keyboard to Centigrade, and display the result. The formula to do 
this is: 


Centigrade = 5*(Fahrenheit - 32 )/9 


Try to do this with integers first. You should be able to get a result to 
the nearest degree. Then try it with floating point values. 


2 Write a program to read in values of each of the types you've learnt 
about, and then display them. Find out what happens when you have 
the following sorts of errors: 


The format specifier doesn't match the type of value being 
displayed. 

You omit the & in front of the name of a variable in a 
scanf() argument. 


3 Write a program to allow a capital letter to be entered and display the 
letter and its sequence number in the alphabet - A is 1, B is 2 and so 
on. See what happens when characters other than letters are entered. 


4 Write a program to read in a long integer value, and then: 


Alter the rightmost 8 bits of the value to 1, and display the 
result as an integer. 

Change the 1 bits to 0 and vice versa, displaying the result 
as an integer. 

Change the 1 bits to 0 and vice versa, add 1, and display 
the result. 

Run the program with both positive and negative numbers 
as input. 
Logic and Loops





In the last chapter we saw how to calculate in C. In this chapter we are 
going to add the potential for intelligence in a program. The language 
elements in this chapter, on top of those we saw in the last, provide the 
potential for writing chess playing programs, programs to predict the 
weather or possibly even the result of the next presidential election. All you 
have to do is figure out how. 


By the end of this chapter you will have learnt: 


How to compare values, and affect the sequence of execution based 
on the result of a comparison. 


How to assemble multiple comparisons into a single logical 
expression for decision-making purposes. 


What statements are provided in C for repeating one or more 
statements until a given condition is satisfied, and how to apply 
them. 


Some additional capabilities for reading input from the keyboard and 
writing output to the screen. 
Making Decisions


The ability to compare values and alter the course of a program based on 
the result is what gives your computer the power to solve problems rather 
than just being a big calculator. There are two aspects to decision making 
in your program: the means of making comparisons between items of 
data, and the program statements that alter the sequence of execution 
based on the result. We will start by comparing data values. 


Relational Operators 


To determine how one item of data compares to another we're going to 
use relational operators. Since character information is ultimately 
represented by numeric codes, we're always dealing with numbers in one 
way or another, so comparing numerical values is integral to all decision 
making. We have available to us six operators for comparing two values: 


<    less than            <=   less than or equal to
>    greater than         >=   greater than or equal to
==   equal to             !=   not equal to


The == Operator 


The ‘equal to’ comparison operator has two successive equal signs. This is 
because a single equals sign is treated as the assignment operator. You will 
find that using one equals sign instead of two is a common mistake that 
most C programmers make. Bear this in mind because it won't necessarily 
cause a compiler error, but your program will behave rather differently than 
intended. 






Remember, you use == when you are asking whether two 
variables are exactly the same. You use = when you are telling 
them to be the same. 


If you do find that you are forever typing a single equals sign instead of a 
pair, then try putting the constant, assuming that there is one, first. For 
example: 










if (x == 10) /* The right version */ 

if (x = 10) /* Wrong, but valid code; the compiler 
won't complain, and will always 
evaluate to true! */ 

if (10 = x) /* Wrong, and illegal; the compiler 
will complain */ 
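As a rough illustration of the difference, here is a small sketch. The
function names is_ten() and after_mistake() are made up for this example
and aren't part of the original listings; the second one makes the
single-equals mistake on purpose so that you can see its effect.

```c
/* Hypothetical helpers showing == versus = in an if condition */

/* Returns 1 if x is 10. Writing the constant first means that typing
   = by mistake (10 = x) would be rejected by the compiler.           */
int is_ten(int x)
{
   if( 10 == x )
      return 1;
   return 0;
}

/* The classic mistake, made deliberately: x = 10 assigns 10 to x, and
   the value of the assignment, 10, is non-zero and so counts as True. */
int after_mistake(int x)
{
   if( (x = 10) )     /* extra parentheses keep compiler warnings quiet */
      return x;       /* always reached, and x is now 10                */
   return 0;
}
```

Whatever value you pass to after_mistake(), the original value is lost and
the function hands back 10.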


True and False 


When we make a decision in C, one action is taken if an expression is True 
(represented by 1), and another action is taken if an expression is False 
(represented by 0). In fact, this isn't strictly accurate, since in C, any non- 
zero integer, including -1, will be interpreted as True for decision-making 
purposes. We can see how this works by having a look at a few simple 
examples of comparisons. Let's assume that we define the following 
variables: 





We can now write some examples of comparing values. Take a look at the 
following expressions and their equivalent values: 


Expression        Value


First == 65       True
First < Last      True
'E' <= First      False
First != Last     True
-1 < y            True. The variable y has a very small negative
                  value, -0.000000000025, and so it's greater
                  than -1.


False 


True. The expression 3 + y is slightly less 
than 3, and 2.0*x is exactly 3. 
We can use the relational operators to compare values of any of the basic
data types, so all we need now is a practical way of using the results of a 
comparison to modify the behavior of a program. 
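Since each comparison produces 1 for True and 0 for False, we can check
the expressions above by simply adding the results together. This is only
a sketch under assumptions: the declarations are a guess, because only
First being 65 (the code for 'A' on an ASCII system), x being 1.5 and the
value of y are implied by the text. The value of Last and the function
name count_true() are invented here.

```c
/* Assumed declarations; Last is a guess made for this sketch */
int count_true(void)
{
   char First = 'A';
   char Last = 'Z';
   double x = 1.5;
   double y = -0.000000000025;
   int count = 0;

   count += (First == 65);      /* True on an ASCII system: adds 1 */
   count += (First < Last);     /* True: adds 1                    */
   count += ('E' <= First);     /* False: adds 0                   */
   count += (First != Last);    /* True: adds 1                    */
   count += (-1 < y);           /* True: adds 1                    */
   count += (2.0*x == 3.0);     /* True: 1.5 doubled is exactly 3  */

   return count;                /* five of the six are True        */
}
```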


The if Statement 


The if statement allows you to execute a single statement or a block of 
statements enclosed within curly braces if a given expression results in the 
value True. If the expression results in 0, which is False, then the statement 
or block isn't executed. This is illustrated here: 


if( Expression ) 
LoopStatement; 
NextStatement; 





A simple example of an if statement is: 





The condition to be tested appears in parentheses immediately following the 
keyword, if. Note the position of the semi-colon here. It appears after the 
statement following the if, not after the condition in parentheses. You can 
also see how the statement following the if is indented, to indicate that it's 
associated with the if. 





The output statement will only be executed if the variable Letter has the
value 'A'. We could extend this example to change the value of Letter, if
it contains the value 'A':





Here, if the condition is True, we execute the statements in the block.
Without the braces, only the first statement would be the subject of the if,
and the statement assigning the value 'a' to Letter would be executed,
irrespective of the condition. Note that there is only a semi-colon after each
of the statements in the block, not after the closing brace at the end of the
block. Now, as a result of Letter having the value 'A', we change its
value to 'a' after outputting the same message as before. If the condition is
False, there will be no message and the value isn't changed.
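A minimal sketch of an if controlling a block might look like this. The
function name change_A() and the wording of the message are assumptions
made for illustration; the function returns the letter so that the caller
can see whether it was changed.

```c
#include <stdio.h>

/* Hypothetical function wrapping an if that controls a block */
char change_A(char Letter)
{
   if( Letter == 'A' )
   {
      printf("\nThe letter is a capital A.");  /* assumed message        */
      Letter = 'a';                            /* only done when True    */
   }                                           /* no semi-colon after }  */
   return Letter;
}
```

Calling change_A('A') gives back 'a', while any other character comes back
unchanged and no message appears.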


The else Keyword 


The if statement we have so far used executes a statement if the
expression specified results in the value True. Program execution then
continues with the next statement. We also have an extended version of the
if statement which allows one statement to be executed if the result is
True, and another if the result is False. Execution then continues with the
next statement after the two choices:
Note that number%2 on its own is interpreted as True whenever it is
non-zero, which for a positive number is equivalent to number%2 == 1.


The if-else combination provides a choice between two options: 


if( Expression )
   Statement_1;
else
   Statement_2;
NextStatement;
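As a sketch of the two-way choice, the following function tests whether a
number is odd or even. The name report_parity() and the messages are
invented for this example; the return value simply records which branch
was taken.

```c
#include <stdio.h>

/* Hypothetical example of if-else: returns 1 for odd, 0 for even */
int report_parity(int number)
{
   if( number%2 )                /* a non-zero remainder counts as True */
   {
      printf("\n%d is odd.", number);
      return 1;
   }
   else
   {
      printf("\n%d is even.", number);
      return 0;
   }
}
```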





Nesting if-else Statements 


You can nest if statements within if statements, if-else statements within 
ifs, ifs within if-else statements and if-else statements within if-else 
statements. This gives us plenty of room for serious confusion, so let's look 
at these with a few examples. 


An Example of if-else Nesting 


Taking the second case, an example of a nested if-else within an if might 
be: 
This assumes that we have a variable coffee, which has the value 'y'
when there is coffee, and a variable donuts, indicating the similar presence 
(or absence) of donuts. The test for donuts is executed if the result of the 
test for coffee is True, so the messages reflect the correct situation in each 
case. However, it's easy to get this confused - if we write much the same 
thing with incorrect indentation we can be trapped into the wrong 
conclusion: 


if( coffee == 'y' )
   if( donuts == 'y' )
      printf("\nWe have coffee and donuts.");
else                            /* This else is indented incorrectly */
   printf("\nWe have no coffee...");       /* Wrong! */


This mistake is easy to see here. In spite of how it looks, the else doesn't
belong to the first if; it will only be executed if there is coffee and there
aren't any donuts, so when it's executed the message will always be untrue.


The if-else Ownership Rule 


Whenever things look a bit complicated, you can apply the following rule to 
sort things out: 











When you are looking at nested ifs, you need to bear in mind 
the rule about which if owns which else; an else always 
belongs to the nearest preceding if that isn't already spoken 
for by another else. 













When you're writing your own programs, you can always use braces to 
make the situation clearer. 


It's not really necessary in such a simple case, but we could write the last 
example as: 


if( coffee == 'y' )
{
   if( donuts == 'y' )
      printf("\nWe have coffee and donuts.");
   else
      printf("\nWe have coffee at least...");
}
Now that we know the rules, understanding the case of the if within the
if-else becomes easy: 





Here the braces are essential. If we leave them out, then the else would
belong to the if which is looking out for donuts. In this kind of situation
it's easy to forget to include them and create a logic error, which may be
quite hard to locate.
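A sketch of an if inside an if-else, with the braces in place, could look
like the following. The function name coffee_report() and its return codes
are assumptions made so that the effect can be checked; the messages follow
the coffee and donuts examples above.

```c
#include <stdio.h>

/* Hypothetical example. Returns 1 if the "coffee and donuts" message
   was printed, 2 if the "no coffee" message was printed, and 0 if
   nothing was printed.                                               */
int coffee_report(char coffee, char donuts)
{
   int message = 0;

   if( coffee == 'y' )
   {                             /* these braces close off the inner if */
      if( donuts == 'y' )
      {
         printf("\nWe have coffee and donuts.");
         message = 1;
      }
   }
   else                          /* so this else belongs to the coffee test */
   {
      printf("\nWe have no coffee...");
      message = 2;
   }
   return message;
}
```

With coffee but no donuts nothing is printed, which is correct. Without
the outer braces, that same case would wrongly report that there is no
coffee.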


The last case can get very messy even with just one level of nesting. Coffee 
and donuts are always welcome, so let's have some more: 





This is starting to look slightly muddled. As the rule will verify that this is 
correct, no braces are necessary, but having them makes things clearer: 





If you combine enough nested ifs, you can almost guarantee a mistake 
somewhere. 


Logical Operators 


As we have just seen, where we have two or more related conditions, using 
ifs can be a little cumbersome. We have tried our ‘iffy’ talents on the 
important question of whether there is coffee and donuts, but in practice 
you may want to check more complex conditions. You could be searching a 
personnel file for someone who is over 21 but under 35, is female with a 
college degree but not in psychology, and who is unmarried and fluent in 
Quechua or Waica. Defining a test for this could involve an if to make 
your eyes water. 


Logical operators provide a neat and simple solution. We can combine a 
series of comparisons using logical operators within a single expression, 
ending up with just one if, virtually regardless of the complexity of the 
conditions. We have just three logical operators at our disposal: 


&& logical AND 
|| logical OR 
! logical negation (NOT) 


Let's consider how each of these is used.


AND 


You would use the logical AND operator, &&, where you have two
conditions and you want both to be True, giving a True result. This is the
case when testing for upper case. For example, the value being tested must
be greater than or equal to 'A', and less than or equal to 'Z'. If either or
both conditions aren't True, then the value isn't a capital letter. If we take
the example of a value stored in a char variable Letter, we could write
the test that originally used two ifs as a single if:





Here the output statement will only be executed if both of the conditions 
combined by the operator && are True. The effect of logical operators is 
often shown using a truth table, just as we did in the previous chapter. 
Please refer back to those if you need a reference. 
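The capital letter test might be sketched as follows. The function name
is_capital() and the message are invented for this example, and the test
assumes that the codes for 'A' to 'Z' are consecutive, as they are in
ASCII.

```c
#include <stdio.h>

/* Hypothetical example of &&: returns 1 for a capital letter, else 0 */
int is_capital(char Letter)
{
   if( (Letter >= 'A') && (Letter <= 'Z') )   /* both must be True */
   {
      printf("\nYou entered a capital letter.");
      return 1;
   }
   return 0;
}
```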
OR


The OR operator, ||, applies when you have two conditions and you want 
a True result if either (or both) of them are True. For example, you might 
be considered creditworthy for a loan from the bank if your income was at 
least $100,000 a year, or if you had $1,000,000 in cash. This could be tested 
using the following if: 





The response emerges when either or both of the conditions are True. A 
better response might be "Why do you want to borrow?". 
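The bank's test could be sketched like this. The function name
creditworthy() and the variable names are assumptions; long is used
because a value of 1,000,000 may not fit in an int on every machine.

```c
/* Hypothetical example of ||: returns 1 if either condition is met */
int creditworthy(long income, long cash)
{
   if( (income >= 100000L) || (cash >= 1000000L) )
      return 1;                  /* one True condition is enough */
   return 0;
}
```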


NOT 


The third logical operator, !, takes one operand with a logical value, True 
or False, and inverts its value. So if the value of Test is True, then !Test 
becomes False. For example, if x has the value 10, then 


!(x>5)


is False, since x > 5 is True.


Using Several Logical Operators 


You can combine conditional expressions and logical operators to the degree 
that you feel comfortable with. For example, we could construct a test for 
whether a variable contained a letter, just using a single if. 


An Example 


Let's write it as a working example: 





if( ((Letter>='A')&&(Letter<='Z')) || ((Letter>='a')&&(Letter<='z')) )
   printf("\nYou entered a letter.");
else
   printf("\nYou didn't enter a letter.");









Program Analysis 


The interesting part of this program is in the if statement condition, 
consisting of two logical expressions combined with the OR operator. 


if( ((Letter>='A')&&(Letter<='Z')) || ((Letter>='a')&&(Letter<='z')) )


Each combines a pair of comparisons with the operator AND, so both must 
be True if the logical expression combining them is to be True. The first 
logical expression is True if the input is upper case, and the second is True 
if the input is lower case. 
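To bring the three logical operators together, here is a sketch that puts
the letter test in a function and then uses ! to deal with the failing
case first. The names is_letter() and report_letter() are hypothetical,
and the ranges assume a character set such as ASCII where the letters are
consecutive.

```c
#include <stdio.h>

/* Returns 1 if Letter is an upper or lower case letter, 0 otherwise */
int is_letter(char Letter)
{
   return ((Letter >= 'A') && (Letter <= 'Z')) ||
          ((Letter >= 'a') && (Letter <= 'z'));
}

/* Uses ! to invert the result of the test */
int report_letter(char Letter)
{
   if( !is_letter(Letter) )
   {
      printf("\nYou didn't enter a letter.");
      return 0;
   }
   printf("\nYou entered a letter.");
   return 1;
}
```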


The Conditional Operator 


The conditional operator, sometimes called the ternary operator, enables you 
to choose to execute one of two expressions, depending on the value of a 
True or False condition. 


Syntax of the Conditional Operator 


The conditional operator can generally be written as:


condition ? expression1 : expression2


If condition evaluates as True, then the result is the value of expression1, 
and if it evaluates to False, then the result is the value of expression2. It’s 
best understood by looking at an example. 


A Conditional Operator Example 


Suppose we have two variables, a and b, and we want to assign the 
maximum value between them to a third variable, c. We can do this with 
the statement: 


c = (a>b) ? a : b;          /* set c to the maximum of a and b */
The conditional operator has a logical expression as its first operand, in
this case a very simple one, a>b. If this expression has the value 1 (True),
then the second operand, a in this case, is returned as a value. If it's False,
then the third operand, b in this case, is returned instead. Thus the result of
the conditional expression is a, if a is greater than b, and b otherwise.
This value is stored in c. The equivalent if statement is:
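The two forms can be set side by side in a short sketch. The function
names max_conditional() and max_if() are made up for this comparison; both
should always give the same answer.

```c
/* Maximum of a and b using the conditional operator */
int max_conditional(int a, int b)
{
   int c = (a > b) ? a : b;
   return c;
}

/* The same choice written with if and else */
int max_if(int a, int b)
{
   int c;

   if( a > b )
      c = a;
   else
      c = b;
   return c;
}
```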





However, the conditional operator doesn't give you the same flexibility as
an if. It will only allow you to choose between two different expressions. If
you need to do several things with either choice, then the way to go is
with an if statement.


The switch Statement 


The switch statement enables you to select from a number of choices based 
on a fixed set of values for a given expression. We can examine how the 
switch statement works with the following example of a program that was 
an early contender for the home market, but didn't sell very well: 








Program Analysis 




The first printf() statement introduces something new - the output of 
several lines of text, all in one batch: 





Each line appears between double quotes, and there is no comma separating 
one from the next. When you write two or more strings between quotes 
with just whitespace separating them, they will be treated as though they 
were one long string by an ANSI C compiler. This allows you to space 
them out in a readable fashion, without having to write multiple printf() 
statements. Note how a \t (a tab character) is used to indent the choices
when they are displayed. 
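The joining of adjacent strings can be checked with a small sketch. The
menu text and the function names here are invented, not taken from the
original program. Both functions return the value that printf() hands
back, which is the number of characters it wrote.

```c
#include <stdio.h>

/* Three quoted pieces separated only by whitespace: the compiler
   treats them as one long string.                                 */
int show_menu(void)
{
   return printf("\nPick a number:"
                 "\n\t1 Coffee"
                 "\n\t2 Donuts");
}

/* The same text written as a single string, for comparison */
int show_menu_one_string(void)
{
   return printf("\nPick a number:\n\t1 Coffee\n\t2 Donuts");
}
```

Both calls write exactly the same 35 characters, so the split version
costs nothing and is much easier to read in the source.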


The switch keyword is followed by a test condition in parentheses, which 
must evaluate to an integer value. The possible choices in the switch are 
identified by case labels, whose expressions should match the expected 
values taken by the test condition. 


break Statement 


The statements to be executed for a particular case are written following 
the colon after the case label and are normally terminated by a break 
statement which transfers execution to the statement after the switch. The 
break isn't mandatory, but it stops the switch from continuing down the list 
of cases. 
Note that breaking out of the normal flow of a program is considered to be
bad programming style. It is best not to reach for break in every
circumstance, and to use it only when you deem it necessary.


The Flow of a switch Statement 


If the value of choice doesn't correspond with any of the case values 
specified, then the statements preceded by the non-mandatory default label 
are executed. In its absence the switch is exited and the program continues 
with the next statement after the switch: 


switch( Expression )
{
   case Value_1:
      Statement_1;
      break;

   case Value_2:
      Statement_2;
      break;

   case Value_n:
      Statement_n;
      break;

   default:
      DefStatement;
}
NextStatement;
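To see what the break statements are doing, it can help to leave one out
on purpose. In this sketch the function name show_choice() and the
messages are hypothetical; the function returns the number of case
messages that were printed.

```c
#include <stdio.h>

/* Hypothetical example: case 1 has no break, so it runs on into case 2 */
int show_choice(int choice)
{
   int lines = 0;                /* counts the case messages printed */

   switch( choice )
   {
      case 1:
         printf("\nYou chose one.");
         ++lines;
         /* no break here, so execution carries on into case 2 */
      case 2:
         printf("\nYou chose two.");
         ++lines;
         break;

      case 3:
         printf("\nYou chose three.");
         ++lines;
         break;

      default:
         printf("\nNot a valid choice.");
   }
   return lines;
}
```

A choice of 1 prints two messages, because nothing stops execution at the
end of the first case. Any value with no matching case label ends up at
default.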
Each of the case constant expressions must be constant and unique. If two
case constants are the same, the compiler would have no way of knowing 
which should be executed for that value. 


Sharing case Actions


However, different cases don’t need to have a unique action; several cases 
can share the same action, as is shown in the following example: 





Program Analysis


In this example, we have a more complex expression in the switch. If the 
given character isn't a lower case letter then the expression: 


( Letter >= 'a' && Letter <= 'z' )


will result in the value 0. This will then cause the statements following the 
label case 0 to be executed. As long as the character entered is lower 
case, the variable Letter will be multiplied by 1, and will retain its original 
value. 
If a lower case letter is entered, then for all values corresponding to
vowels the same output statement is executed. This is achieved by writing
each of the
case labels corresponding to the five vowels one after the other, before the 
statements to be executed. If a lower case consonant is entered, since there 
are no case labels corresponding to this situation, the default label 
statement is executed. 
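A sketch along the lines described might look like this. The function
name classify(), the messages and the return codes are assumptions; the
switch expression multiplies Letter by the result of the lower case test,
as the analysis explains.

```c
#include <stdio.h>

/* Hypothetical example. Returns 0 if Letter isn't lower case,
   1 for a vowel and 2 for a consonant.                        */
int classify(char Letter)
{
   int kind = 0;

   switch( Letter * (Letter >= 'a' && Letter <= 'z') )
   {
      case 'a':                  /* five labels sharing one action */
      case 'e':
      case 'i':
      case 'o':
      case 'u':
         printf("\nYou entered a vowel.");
         kind = 1;
         break;

      case 0:                    /* the test gave 0, so the product is 0 */
         printf("\nIt is not a lower case letter.");
         kind = 0;
         break;

      default:                   /* any other lower case letter */
         printf("\nYou entered a consonant.");
         kind = 2;
         break;
   }
   return kind;
}
```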


The goto Statement 


The if statement provides you with the flexibility to choose to execute one 
set of statements or another depending on a specified condition, so the 
statement execution sequence is varied depending on data values in the 
program. In contrast, the goto statement is a blunt instrument, providing 
the possibility to branch to a specified program statement, unconditionally. 


The statement to be branched to must be identified by a statement label. 
These identifiers are defined according to the same rules as a variable name, 
but are distinct from the labels of other entities. The statement label is 
followed by a colon and placed before the statement requiring labeling. Here 
is an example: 


This statement has the label MyLabel, and an unconditional branch to this 
position, in the same function of the program, would be written as: 


Using gotos (like breaks) in your program should be avoided as much as 
possible. They tend to encourage very convoluted code that can be 
extremely difficult to follow, and if you ever get the program working it 
can become a nightmare to maintain. As the goto is theoretically 
unnecessary there is always an alternative approach, and a significant cadre 
of programmers say that it should never be used. 





Repeating a Block of Statements 


The ability to repeat a group of statements is fundamental to most 
applications. This programming mechanism is called a loop. Without loops, 
an organization would need to modify the payroll program every time an 
extra employee was hired. Without loops, you would need to restart your 
word processor every time you wanted to open another document. So let's
first understand how a loop works. 


What is a Loop? 


A loop is basically the execution of a sequence of statements until a 
particular condition is True (or False). We can actually write a loop with the 
C statements we have met so far. We just need an if and the dreaded 
goto. 


An Example of a Loop 


For example: 
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Program Analysis 


This program accumulates the sum of integers from 1 to MAX, where MAX 
has been initialized to 10. The first time through the sequence of statements 
beginning with the statement label loop, i is 1 and it's added to sum which 
is zero. In the if, i is incremented by 1, and as long as it's less than or 
equal to max, then the unconditional branch to loop occurs. The cycle 
begins again and the value of i, now 2, is added to sum. This continues 
with i being incremented and added to sum each time until finally, i is 
incremented to 11 in the if, and the branch back won't be executed. If you 
run this example, you will get the output: 


sum = 55 
i= 11 
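The loop described above can be sketched as a function, so that the limit
can be varied. This is not the original EX3-05.C listing; the function
name sum_with_goto() and the parameter max are assumptions made for this
example.

```c
/* Hypothetical sketch of a loop built from an if and a goto */
int sum_with_goto(int max)
{
   int sum = 0;
   int i = 1;

loop:                            /* statement label                */
   sum += i;                     /* add the current value of i     */
   if( ++i <= max )              /* increment i, then test it      */
      goto loop;                 /* branch back for the next value */

   return sum;
}
```

With max set to 10 the function returns 55. Note that the statement after
the label is always executed at least once, even if max is 0.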


This quite clearly shows how the loop works, but it has two serious 
disadvantages. It uses a goto, and it introduces another label into our 
program, both of which we really should be avoiding. 


We can achieve the same thing and more with the next statement that we 
are going to have a look at, which is specifically used for writing a loop. 


Using the for Loop 


The for loop works in a way that looks like an analog of the loop we
created using an if and a goto in the last example.


Syntax of the for Loop 


The general form of the for loop is: 


for( initializing expression ; test expression ; 
increment expression ) 
loop statement; 


Of course, loop statement can be a block of code between braces. The 
sequence of events in executing the for loop is shown here: 




for( Initialization ; Test ; Increment ) 
LoopStatement; 
NextStatement; 





An Example of the for Loop 


So let's get a preliminary understanding of how a for loop works by 
rewriting the last example to use it: 










Program Analysis 


This program gives exactly the same output as the previous example, but 
has accomplished it without a label and using only two lines of code: 





The conditions determining the operation of the loop appear in the for
statement, and there are three expressions that appear within the
parentheses after the keyword for. The first sets i to 1 as the initial
condition, the second specifies the condition that must be True in order to
continue to loop, in this case as long as i is less than or equal to max, and
the third is the action to be taken each time the loop executes.





Any time you find yourself repeating something more than a 
couple of times, then it's worth considering a for loop. They 
will usually save you time and memory. 






Actually, this loop isn't exactly the same as the version in EX3-05.C - it can 
behave differently. You can demonstrate this if you set the value of max to 0 
in both programs and run them again. You will find that the value of sum 
is 1 in EX3-05.C, and 0 in the for loop version, and the value of i differs
too. The reason for this is that the if version of the program always 
executes the loop at least once, since the condition isn't checked until the 
end. The for loop doesn't, so the condition is evidently checked at the 
beginning. 
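The for loop version of the summation might be sketched as follows. The
function name sum_with_for() and the parameter max are assumptions made
for this example, so that different limits can be tried.

```c
/* Hypothetical sketch: sum of the integers from 1 to max */
int sum_with_for(int max)
{
   int sum = 0;
   int i;

   for( i = 1 ; i <= max ; i++ )    /* test is made before each pass */
      sum += i;

   return sum;
}
```

With max set to 10 the result is 55, and with max set to 0 the loop
statement is never executed, so the result is 0.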


The Infinite for Loop 


If you omit the test condition, the value is assumed to be True, so the loop
will continue indefinitely unless you provide some other means of exiting
from it. In fact, if you like, you can omit all the expressions in the
parentheses after the for. This may not seem very useful, but in fact quite
the reverse is true. Have a look at the following example:





Program Analysis 


This program will compute the average of an arbitrary number of values. 
After each value is entered, you need to indicate if you want to enter 
another value. 


Typical output is: 


Enter a value: 10 
Do you want to enter another value ( enter n to end )? y 


Enter a value: 20 
Do you want to enter another value ( enter n to end )? y 
Enter a value: 30
Do you want to enter another value ( enter n to end )? n 


The average of the 3 values you entered is 20.000000 


After declaring and initializing the variables we need, we start a for loop 
with no expressions specified, so there's no provision for ending it here. The 
block immediately following is the subject of the loop which is to be 
repeated, and if the program is ever to end, a means of ending the loop 
must appear in this block (note the ‘break’ statement, which is used to exit 
from the loop if the user decides not to continue). 
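The same pattern, a for with no expressions and a break inside the block,
can be sketched without any keyboard input. The function below is a
made-up example, not the averaging program: it counts how many of the
integers 1, 2, 3 and so on must be added together before the total exceeds
a given limit.

```c
/* Hypothetical example of an infinite for loop ended by a break */
int terms_needed(int limit)
{
   int sum = 0;
   int i = 0;

   for( ;; )                     /* no expressions: loops indefinitely */
   {
      sum += ++i;                /* add the next integer               */
      if( sum > limit )
         break;                  /* the only way out of the loop       */
   }
   return i;
}
```

For a limit of 10 the totals run 1, 3, 6, 10 and 15, so the function
returns 5.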


Infinite loops are not considered to be very good practice - they are 
dangerous, because of the restrictive nature of their structure. Unless you 
include a suitable number of good opt-out clauses, then a user of your 
software (which may unwittingly be yourself) would be unable to exit from 
the loop, and thus unable to leave your program. For now though, because 
you're just in the cocoon stage of C programming you should be aware of 
such tricks and all their dangers. 


Input/Output Tips 


You should compare the format specifiers used in the last example for
inputting a double value using scanf(), and outputting a double value
using printf(). On input, %f specifies that you want to read a float
value, and %lf specifies that you're inputting a value into a double
variable. On output, %f is used for a double value. It's important to use
the correct specifier on input, otherwise the value stored will be
incorrect and may actually overwrite important parts of memory.


So how do we get more digits displayed if we're dealing with a double
value? First of all, we can specify a field width for the output value in a
similar manner to the way integer output is used, so %15f specifies a field
width of 15. The default number of output digits after the decimal point is
6, but you can increase this by specifying a precision value after the field
width, separated from it by a decimal point. So, to display a double value
in a field width of 15, with 10 digits after the decimal point, you would
use the specifier %15.10f. Try out a few variations of this example to get
the feel of how field width and precision work when outputting floating
point values.
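One way to experiment is to look at the value printf() returns, which is
the number of characters it wrote. The two function names in this sketch
are invented for the purpose.

```c
#include <stdio.h>

/* Default format: six digits after the decimal point */
int show_default(double value)
{
   return printf("%f", value);
}

/* Field width of 15 with 10 digits after the decimal point */
int show_wide(double value)
{
   return printf("%15.10f", value);
}
```

For the value 20.0 the first function writes the 9 characters of
20.000000, while the second pads 20.0000000000 with leading spaces to fill
all 15 positions.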














Please refer to Appendix A for more information on input and output 
formatting. 


The continue Statement 


Besides break, there is another statement that is used to affect the control
of a loop - the continue statement, written simply as:


continue;


Executing continue within a loop immediately starts the next loop iteration, 
skipping any remaining statements in the current one. 


A continue Example 
We can demonstrate this with the following code fragment: 


int i = 0, value = 0, product = 1;
for( i = 0 ; i < 10 ; i++ )
{
   scanf(" %d", &value);
   if( value == 0 )              /* If value is zero       */
      continue;                  /* skip to next iteration */
   product *= value;
}


This loop reads 10 values with the intention of producing the product of
the values entered. The if checks whether the value entered was zero, and
if it was, the continue statement skips to the next iteration. Obviously, if
this occurred on the last iteration then the loop would end.
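The effect of continue can be tried out without typing in any values. In
this sketch, which is a hypothetical variation and not the fragment above,
the loop counter itself supplies the values, and zero is skipped so that
it doesn't wipe out the product.

```c
/* Hypothetical example: product of the non-zero integers
   from first to last inclusive                            */
int product_nonzero(int first, int last)
{
   int product = 1;
   int value;

   for( value = first ; value <= last ; value++ )
   {
      if( value == 0 )           /* zero would make the product zero */
         continue;               /* so skip straight to value++      */
      product *= value;
   }
   return product;
}
```

For the range -2 to 3 the values multiplied are -2, -1, 1, 2 and 3, giving
12.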


Using continue 


The continue statement provides a very useful capability, particularly with 
complex loops, where you may need to skip to the end of the current 
iteration from different points in the loop. 
The effect of the break and continue statements on the logic of a for loop
is illustrated here:


for( Initialization ; Test ; Increment )
{
   if(Expression1)
      break;
   if(Expression2)
      continue;
}
NextStatement;


The break and continue statements can also be used with other kinds of 
loop, which we'll be investigating in the next few sections. 
Alternative Iteration Count Variables


So far, we have only used integers to count loop iterations. You are in no 
way restricted as to which type of variable you use to count iterations. 
Look at the following example: 





/* EX3-8.C Display ASCII codes for alphabetic characters */

   /* Loop through the alphabet, displaying each one in three
      formats - as a character, a hex value and a decimal value */

      printf("\n\t    %c%10X%10d    %c%10X%10d",
                capital, capital, capital, small, small, small);





Program Analysis 


The way in which a value is displayed is determined entirely by the format
specifier. A value is displayed as a character by using %c, preceded by four
spaces to align it with the heading. A value is displayed as a hexadecimal
value in a field 10 characters wide, with the specifier %10X. This will
display hexadecimal digits with values from 10 to 15, as A to F. If you
prefer lower case 'a' to 'f', then you can use the specifier %10x. By
default, the value will be right justified. If you wanted it to be left
justified, you would use %-10X. The decimal versions of capital and small
are displayed using the specifier %10d, which we have already seen. This is
also in a field width of 10 characters. Remember that Appendix A summarizes
how input and output is formatted.
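A loop of this kind, using char variables as counters, might be sketched
as follows. This is a reconstruction under assumptions rather than the
original EX3-8.C: the function name show_alphabet() is invented, the
heading line is left out, and the loop relies on the letters having
consecutive codes, as they do in ASCII.

```c
#include <stdio.h>

/* Hypothetical sketch: returns the number of lines displayed */
int show_alphabet(void)
{
   char capital;
   char small;
   int lines = 0;

   /* Two counters step together; the comma lets us set and
      increment both of them in a single for statement.      */
   for( capital = 'A', small = 'a' ; capital <= 'Z' ; capital++, small++ )
   {
      printf("\n\t    %c%10X%10d    %c%10X%10d",
                capital, capital, capital, small, small, small);
      ++lines;
   }
   return lines;
}
```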


You can also use a floating point value as a loop counter. An example of a 
for loop with this kind of counter is as follows: 
This calculates the value of a*x + b for values of x from 0.0 to 2.0 in steps
of 0.25.














Note that there are potential problems with using floating 
point values to control a loop. Decimal floating point numbers 
don't always have an exact binary representation. For example, 
if you start with x as zero and increment it repeatedly by 0.1, 
you may never reach the exact value of 1.0. You should 
therefore avoid checks for exact equality when using floating 
point values to control a loop, or you may end up with an 
infinite loop when you least expect it. 
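A floating point loop counter of the kind described might be sketched
like this. The function name tabulate() and the output layout are
assumptions; the function returns the number of lines it displayed.

```c
#include <stdio.h>

/* Hypothetical sketch: displays a*x + b for x from 0.0 to 2.0 */
int tabulate(double a, double b)
{
   double x;
   int lines = 0;

   for( x = 0.0 ; x <= 2.0 ; x += 0.25 )    /* <= rather than an exact test */
   {
      printf("\nx = %4.2f   a*x + b = %8.2f", x, a*x + b);
      ++lines;
   }
   return lines;
}
```

A step of 0.25 happens to have an exact binary representation, so x takes
exactly nine values from 0.0 to 2.0. A step such as 0.1 doesn't, which is
why the loop test uses <= and not a check for equality.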


The while Loop 


Now that we're expert for loop coders, let's look at a different kind of 
loop - the while loop. With a while loop, the mechanism for repeating a 
set of statements allows execution to continue as long as a specified 
expression has the value True. 


Syntax of the while Loop 


This loop will continue as long as the specified expression is True:


while( expression )
   loop statement;





The loop statement will be constantly repeated as long as expression has
the value True. Once the value becomes False, the program continues with 
the statement following straight after the loop. Of course, a block of 
statements between braces could replace the single loop statement. The logic 
of the while loop is represented in this diagram: 


while( Expression )
   LoopStatement;
NextStatement;





A while Loop Example 


We could rewrite our program to compute an average using the while form 
of loop: 


/* EX3-9.C Using a while loop to compute an average */
#include <stdio.h>

int main()
{
   double value = 0.0;           /* Value entered stored here        */
   double sum = 0.0;             /* Total of values accumulated here */
   int i = 0;                    /* Count of number of values        */
   char indicator = 'y';         /* Continue or not?                 */

   while( indicator == 'y' )     /* Loop as long as y is entered     */
   {
      printf("\nEnter a value: ");
      scanf(" %lf", &value);     /* Read a value                     */
      ++i;                       /* Increment count                  */
      sum += value;              /* Add current input to total       */

      printf("\nDo you want to enter another value ( enter n to end )?");
      scanf(" %c", &indicator);  /* Read indicator                   */
   }

   printf("\nThe average of the %d values you entered is %f", i, sum/i);
   return 0;
}
Program Analysis


For the same input, this version of the program will produce exactly the 
same output as before. It was necessary to initialize indicator with a ‘y’ 
in place of an ‘n’, otherwise the while loop would terminate immediately. 
As long as the condition in the while is True, the loop continues. 


It would be better if the loop condition were extended to allow ‘Y’ to be 
entered, as well as ‘y’. At the moment, the loop will end if you enter ‘Y’ 
- it’s easy to overlook such simple limitations. Modifying the while loop 
like this would do the trick: 


while( (indicator=='y') || (indicator=='Y') ) 





The Infinite while Loop 


You can also create an infinite while loop by using a condition that's 
always True: 
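

A minimal sketch of such a loop is shown here. The function count_digits() is a hypothetical illustration rather than part of any program in this chapter; the thing to notice is the break, which supplies the way out: 

```c
/* Hypothetical helper: counts the decimal digits in n */
int count_digits(long n)
{
   int digits = 0;

   while( 1 )            /* The condition is always True */
   {
      ++digits;          /* Count the current digit */
      n /= 10;           /* and discard it */
      if( n == 0 )       /* Nothing left, so this is the exit */
         break;
   }
   return digits;
}
```

A call such as count_digits(12345) goes round the loop five times before the break is reached. 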





Naturally, the same requirement that applied to the infinite for loop 
applies here as well - in other words, there must be some way to exit 
from within the loop. 


The do-while Loop 


The do-while loop is similar to the while loop in that it continues as long 
as the specified expression remains True. The main difference is that the 
condition is checked at the end of the do-while loop, not at the beginning. 
Thus the loop statement is always executed at least once. 
Syntax of the do-while Loop 


The general form of the do-while loop is: 


do 
{ 
   loop_statement; 
}while( expression ); 


The logic of this form of loop is shown in this illustration: 


do 
{ 

LoopStatement; 
}while(Expression); 
NextStatement; 





A do-while Example 


We could replace the while loop in the last version of the program to 
calculate an average with a do-while loop: 


do 
{ 
   printf("\nEnter a value: "); 
   scanf(" %lf", &value);       /* Read a value */ 
   ++i;                         /* Increment count */ 
   sum += value;                /* Add current input to total */ 

   printf("\nDo you want to enter another value ( enter n to end )?"); 
   scanf(" %c", &indicator);    /* Read indicator */ 
}while( (indicator=='y') || (indicator=='Y') ); 


There is little to choose between them except that, for correct operation, this 
version doesn't depend on the initial value set. As long as you want to 
enter at least one value - which isn't unreasonable for the calculation in 
question - then this version is preferable. 






Notice the semi-colon after the while statement in a do-while 
loop. There isn't one in the while loop. 


The do-while loop is rarely used compared with the other two forms. Keep 
it in the back of your mind though, because when you need a loop that 
executes at least once, it delivers the goods. 


Summary 


In this chapter we've assembled all of the essential mechanisms for making 
decisions in C, and we've also gone through all the facilities for repeating a 
group of statements. The essentials of what we have discussed are: 


The basic decision-making capability is based on the set of relational 
operators, which allow expressions to be tested and compared, 
yielding a value of True or False. 


When a condition is tested, True is normally represented by 1, 
although any non-zero integer will also be interpreted as True, and 
False is represented by 0. 


The decision-making capability in C is provided by the if statement, 
the switch statement and the conditional operator. 


There are three basic methods provided for repeating a group of 
statements. They are: 


The for loop which allows the loop to repeat a given number of 
times. 


The while loop which allows a loop to continue as long as a 
specified condition is True. 


The do-while loop which executes the loop at least once and allows 
continuation of the loop as long as a specified condition is True. 


The continue keyword allows you to skip the remainder of the 
current iteration in a loop and go straight to the next iteration. 


The keyword break provides an immediate exit from a loop, and an 
exit from a switch at the end of a group of statements for a given 
case value. 


Programming Exercises 


1  Write a program to compute the maximum and minimum of an 
   arbitrary number of values entered from the keyboard. The program 
   should allow more than one sequence of input to be entered. Use an 
   input prompt to control whether more data is to be entered and when 
   the program is to end. 


2  Write a program to create and display a multiplication table for two 
   ranges of values entered and defined from the keyboard. Make the 
   range with the least number of values run across the screen in a row. 


3  Write a program to display all the ASCII characters in a 16x16 table, 
   which has the first hexadecimal digit of a given character as a row 
   label, and the second hexadecimal digit as a column label. 


4  A prime number is an integer greater than 1 that is only exactly 
   divisible by 1 and itself. Write a program to compute all the primes 
   less than or equal to an integer entered from the keyboard. 


   (Hint: for each number greater than 2 and no greater than the 
   number entered, check that it isn’t even. Also check that 
   it isn’t divisible by any odd number greater than 1 and 
   less than the number being tested. Remember that 2 itself is prime.) 


In this chapter we will look at how collections of data can be created and 
manipulated. The methods involved here for dealing with data indirectly are 
very characteristic of C, and are a major reason why the language is so 
powerful. By the end of this chapter you will understand: 


What arrays are, how they are created and used. 

How to use arrays for holding and processing character strings. 

How to declare and use multi-dimensional arrays. 

What the operator sizeof is used for. 

What a pointer is and how it works. 

The relationship between pointers and arrays. 

How to process strings using pointers. 

How you can allocate and use memory during the execution of your 
program. 


Collections of Data Values 


Each type of variable that we've used up until now has contained only a 
single item of information (incidentally, such variables are often referred to 
as scalar variables). The most obvious extension necessary to handle 
applications of a broader scope would be the ability to reference several 
data elements of a particular type by using a single variable name. 


A Simple Scenario 


Suppose you had a fanatical interest in the weather and wanted to record 
the rainfall, together with maximum and minimum temperatures, on a daily 
basis throughout the year. Since there are a maximum of 366 days in a year, 
this would involve a maximum of 366 values for each of the three types of 
data. So we could record these three types of data under the names 
TempMax, TempMin and Rainfall. 


Now, in order to feed your computer with this information and to form 
some sort of opinion on it, you want to be able to reference any set of the 
data items by their generic name. You also want to be able to select a 
particular member of each set by some means, the rainfall figure for day 62, 
for example. 


Arrays 


The mechanism in C to do all of this, and more, is called an array. An 
array is simply a block of several contiguous memory locations, each of 
which can store an item of data of a given type, say int or double, and be 
referenced through a common variable name. Each of the recorded data 
values of rainfall can be stored in a single array, which we have agreed to 
name Rainfall. 


Declaring an Array 


To declare an array with the name Rainfall to store values of type float, 
you would use the declaration statement: 


float Rainfall[366]; 


The size of the array appears between square brackets immediately after the 
array name, so here we've declared the array Rainfall[] as being able to 
store 366 separate values, which are normally referred to as elements of the 
array. When referring to an array in the text, we will include the square 
brackets to make it clear that we're dealing with an array, and not a 
variable or a function. 


Since a single float value normally occupies 4 bytes and we have 366 
values in the array, the total memory occupied by the array Rainfall[] is 
a significant amount - 1464 bytes. 


Index Values 


Individual elements in an array are referenced by an index value. The first 
element has the sequence number 0, the second 1 and so on. Alternatively, 
you can envisage the index value as an offset from the first element in the 
array, so that the second element is offset by 1 from the first, the third 
element is offset by 2 from the first and so on. 


Remember, the last element always has an index of one less 
than the number of elements. 





An Example of Index Values 


Assuming that you're going to store the rainfall figures in sequence, the 
index value to access the value for day 62 would be 61, and you would 
reference this element with the expression Rainfall[61]. Due to a leaking 
fire hydrant, which always leaks on this day, you need to apply a correction 
factor of six inches. A C program statement to do this would be: 
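

One plausible form of the statement is sketched here. Two things are assumptions: the correction is taken to be a subtraction, since the hydrant adds to the reading, and the statement is wrapped in a hypothetical function, correct_for_hydrant(), so that it can be tried out on its own: 

```c
/* Hypothetical helper: applies the six inch correction to day 62.
   Subtracting is an assumption; the text doesn't give the direction. */
void correct_for_hydrant(float Rainfall[])
{
   Rainfall[61] -= 6.0f;   /* Day 62 is stored at index 61 */
}
```
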





This demonstrates that you can use an array element reference just like a 
normal variable. 


Using Index Values 


An integer was used as the index value when we applied the hydrant 
correction factor, but we can use any expression that results in a valid index 
for the array concerned. If you know that your equipment for recording 
rainfall consistently produces readings that are too high, and that the true 
values are 90% of those recorded, you can correct all the values for the year 
with a loop: 
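

A loop along the following lines would do it. This is a sketch: wrapping the loop in a function named correct_rainfall() is our own choice here, made so that the code can be tried out in isolation: 

```c
/* Hypothetical helper: scales all 366 elements of Rainfall[] by 0.9 */
void correct_rainfall(float Rainfall[])
{
   int i = 0;

   for( i = 0 ; i < 366 ; i++ )   /* Index runs from 0 to 365 */
      Rainfall[i] *= 0.9f;
}
```
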





It’s a very compact piece of code for processing 366 elements, isn't it? This 
loop will multiply each element of the array Rainfall[] by 0.9, starting 
with the element referenced by an index value of 0, up to the element 
referenced by 365. 


Arrays in Memory 


It can sometimes be useful to understand how an array is laid out in 
memory. Assuming that the array starts at address 0x1000 in memory, the 
basic structure of an array is: 


float Rainfall[366]; 

Element          Rainfall[0]  Rainfall[1]  Rainfall[2]  Rainfall[3]  ...  Rainfall[365] 
Memory address   1000         1004         1008         100C         ...  15B4 


Array elements are stored one after another in ascending address order, so 
successive elements will be at regular increments of 4 bytes (since each 
float value occupies 4 bytes). 








It's your responsibility to make sure that the index values in your program 
stay within the valid range for your array. Attempts to reference memory 
locations outside of an array will cause an error on machines with memory 
protection in operation, or worse, let your program continue for a while, 
only to trip you up later on. 


Initializing Arrays 


Arrays can be initialized when they’re declared, just like scalar variables. In 
the case of an array, initial values are given in braces, like this: 
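

For instance, the declaration might look like the first one in this sketch. The twelve values are borrowed from the temperature data that appears in the program later in this chapter, and Partial[] is a hypothetical second array, added to show what happens when the list is shorter than the array: 

```c
/* One initial value for each of the 12 elements */
int TempMax[12] = { 50, 45, 52, 60, 66, 69, 67, 84, 85, 71, 67, 53 };

/* Only two values for five elements: the last three are set to zero */
int Partial[5] = { 1, 2 };
```
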





Each value is assigned to an individual array element, starting from element 
zero, so that TempMax[0] is 50, TempMax[2] is 52 and so on. 


Be careful that you don’t include more values than there are elements in the 
array! Of course, you can have fewer values, and those not given values 
will be initialized with zeros. Note that array elements will only be set to 
zero if there’s some sort of initialization list present. If there isn’t, then 
an array declared inside a function is likely to contain garbage values, 
which are liable to cause serious problems. 


An Example of Arrays 


Let's suppose that you're able to generate rainfall by firing a 300 pound 
package containing a secret compound into the air with a giant catapult you 
have erected in your yard. This only works, however, when the temperature 
has been below 70 degrees Fahrenheit for the past month, and you only 
want to risk trying it (because of the unscientific, abusive, unreasonable and 
unpleasant reaction of neighbors) when the mean rainfall is below the 
annual average for two successive months. All you need, to be the hero of 
the weather scene, is a program which will tell you when to fire the 
catapult. The following program will test out the methodology with data for 
a whole year: 
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Program Analysis 
This program works extremely well, producing the output: 


Fire the catapult for month 2! 
Fire the catapult for month 6! 
Fire the catapult for month 7! 
Fire the catapult for month 12! 


This means that you would have had four rainmaking opportunities in the 
year. In all probability, it also means that the neighbors will insist that you 
dismantle the catapult and take dancing lessons, but let's see how it works 
anyway. 


We have something new in the line of code: 


#define SAMPLES 12     /* Number of data samples */ 


This is a pre-processor command that will be processed before the program 
is compiled. It defines the string SAMPLES as being equivalent to 12, so 
wherever SAMPLES appears, it'll be replaced by 12. The #define command 
enables you to avoid the use of explicit numbers by enabling you to use a 
more meaningful mnemonic instead. There are quite a few more pre-processor 
commands, and we will be looking at these in Chapter 9. 


Both arrays, TempMax[] and Rainfall[], are declared with SAMPLES number 
of elements, and their values are included in an initializing list for each. 


Because we're dealing with pairs of months, corresponding to index values 
i and i+1, the for loop index runs from 0 to SAMPLES-2, so that i+1 never 
goes beyond the last element: 


for( i=0 ; i<SAMPLES-1 ; i++ ) 


The average of two successive months is stored in the variable 
BiMonthRain, and this is used in the following if statement to determine 
when the rainmaking should take place: 


if( (BiMonthRain<RainMean) && (TempMax[i+1]<CRITICALT) ) 
   printf("\nFire the catapult for month %d!", i+2); 








The if expression uses a logical && to combine the conditions that 
BiMonthRain should be less than the annual average, and the maximum 
temperature for that month should be less than the critical temperature. The 
message will only be output by the printf() function when both conditions 
are fulfilled. 


You could avoid the need for the variable BiMonthRain by substituting the 
expression actually in the if statement, but it makes the program a little 
less clear, and hardly seems worth it just to save 4 bytes. 


Character Arrays and String Handling 


An array of type char is called a character array and is generally used to 
store a string. A character array is handled a little differently from an array 
of numeric values, since you frequently want to treat it as a single entity, 
whereas a numeric array is almost always a set of distinct values. 


A character array of a given number of elements may be used to store 
strings of different lengths at different times. For these reasons, a character 
string in C is a sequence of characters, with a special character appended to 
indicate the end of the string. The string-terminating character is defined by 
the escape sequence ‘\0’ and is referred to as a null character. The 
representation of a string in memory is shown here: 


[Figure: a string of ten characters stored one per byte at memory addresses 1000 to 1009, each shown with its ASCII code, followed by the null character at address 100A.] 


The diagram assumes that the string is stored at memory location 0x1000. It 
also shows the ASCII code of each character in the string as a hexadecimal 
value. Each character occupies one byte, so, together with the null character, 
a string requires a number of bytes that is one greater than the number of 
characters it is composed of. In this case the string occupies 11 bytes. 


We can declare a character array and initialize it with a string constant 
between quotes: 
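

A declaration of this kind could look as follows. The name Publisher and the string itself are assumptions made for illustration; the point is that ten characters need an array of eleven elements: 

```c
/* Hypothetical example: 10 characters plus the terminating null makes 11 */
char Publisher[11] = "Wrox Press";
```
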





Note that the terminating ‘\0’ is supplied automatically by the compiler, so 
there is no need to include one yourself. 






Remember, you must declare the array one bigger than the 
number of characters you want to store, allowing room for the 
compiler to automatically add ‘\0’. 






You can even let the compiler work out the length of an initialized array 
for you. Consider the following declaration: 
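

Here is one such declaration, offered as a sketch. The name Title and its contents are hypothetical, chosen so that the string has 15 characters: 

```c
/* Hypothetical example: no dimension given, so the compiler counts
   the 15 characters and adds one element for the terminating null */
char Title[] = "Instant C guide";
```
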








Because the dimension is unspecified, the compiler will allocate enough 
space to hold the initializing string plus the terminating null, in this case 16 
elements. Of course, if you want to use this array for storing a different 
string later, the new string must not need more than 16 bytes, that is, 
15 characters plus the terminating null. Generally, it's your responsibility 
to ensure that the array is large enough for any string you might 
subsequently want to store. 


String Input 


So far, we've used the function scanf() for reading all our input from the 
keyboard. We can also use it to read in a character string. It supports the 
format specifier %s in order to do this, but has one rather serious limitation 
- it won't read a string containing blanks, because any whitespace character 
signals the end of input. If we tried to read the name of the physicist 
Richard Feynman using scanf(), we would only get ‘Richard’. 


The gets Function 


Fortunately, the header file STDIO.H contains declarations of a number of 
other functions for reading characters from the keyboard. The one that we 
shall look at here is the function gets(), which reads a string into a 
character array and is typically used with statements such as: 


char Name[80]; 
gets(Name); 





These statements first declare a char array Name[] with 80 elements, and 
then read characters from stdin, normally the keyboard. 


Characters are read from stdin until the ‘\n’ (newline) character is read - 
the ‘\n’ character is generated when you press the Return key. After the 
input string is stored into memory, a ‘\0’ replaces the newline character. 
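

One caution for anyone using a current compiler: gets() has no way of knowing how big the array is, and it was dropped from the C standard in the 2011 revision. The library function fgets() does the same job but takes the array size as an argument, and it leaves the newline in the array if there was room for it. The sketch below shows one way to use it; strip_newline() and read_line() are hypothetical helper names, not library functions: 

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: replaces a trailing newline, if any, with '\0' */
void strip_newline(char text[])
{
   size_t length = strlen(text);

   if( length > 0 && text[length-1] == '\n' )
      text[length-1] = '\0';
}

/* Hypothetical helper: reads one line from stdin into buffer[], storing
   at most size bytes including the terminating null.
   Returns 1 if a line was read, 0 at end of input. */
int read_line(char buffer[], int size)
{
   if( fgets(buffer, size, stdin) == NULL )
      return 0;
   strip_newline(buffer);
   return 1;
}
```

With these in place, a call such as read_line(Name, 80) could stand in for gets(Name). 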


A String Reading Example 


We now have enough knowledge to write a simple program to read a string 
and then count how many characters it contains: 
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Program Analysis 


The while loop continues as long as the current character referenced by 
buffer[count] isn't ‘\0’. This sort of checking on the current character 
while stepping through an array is a common technique in C, although 
there are better ways of performing this particular operation (primarily via 
library functions). The only action in the loop is to increment count for 
each non-null character entered. 
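

The loop being described can be sketched as follows. The function name count_characters() is an assumption, made so that the loop can be shown on its own: 

```c
/* Hypothetical helper: counts the characters in buffer[] up to,
   but not including, the terminating null */
int count_characters(const char buffer[])
{
   int count = 0;

   while( buffer[count] != '\0' )   /* Stop at the null character */
      count++;
   return count;
}
```

The standard library function strlen(), declared in STRING.H, returns the same count. 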


Finally, let's note a couple of points about displaying the string and 
character count. The string output uses the specifier %s to output string 
characters until a ‘\0’ character is found. If we only wanted to output the 
first 10 characters of the string and ignore the rest, we could use a precision 
specification after the % symbol, like %.10s. A plain width, as in %10s, only 
sets the minimum number of character positions used; it doesn't cut a 
longer string short. 


In order to output a double quote character in the printf() 
format string argument, we need to use the escape sequence 
\", because a bare quote character would instead signal the 
end of the format string. 


There is a serious flaw in this program: nothing prevents you from entering 
more than 79 characters. If you do, gets() simply carries on storing 
characters beyond the end of the array, overwriting whatever happens to be 
there. How can we protect against this? 


Managing String Input 


In order to gain control of the situation, we can deal with string input by 
reading one character at a time, where we can check that we don't exceed 
the capacity of the intended array. Another function, getchar(), is available 
from the standard library, which will enable us to do this: 


/* EX4-03.C String input with some security */ 
#include <stdio.h> 

#define MAX 80                        /* Maximum buffer size */ 

int main() 
{ 
   char buffer[MAX];                  /* Input buffer */ 
   int ch;                            /* Character store (getchar() returns an int) */ 
   int count = 0;                     /* Character count */ 

   printf("\nEnter a string of less than %d characters:\n", MAX); 

   /* Read characters from the keyboard. */ 
   while( 
          ((ch=getchar()) != '\n')    /* While the character isn't */ 
                                      /* end-of-line and there's */ 
          && (count < MAX-2) )        /* enough space left to include */ 
                                      /* the NULL on the end, */ 
      buffer[count++] = ch;           /* add it to the buffer and */ 
                                      /* increment the counter. */ 

   /* Finished reading - why has the loop finished? */ 
   /* If it was because the string finished, add the null terminator */ 
   if( ch == '\n' ) 
      buffer[count] = '\0'; 

   /* But if we ran out of space, print an error message */ 
   else 
   { 
      printf("\nToo many characters. Program aborted."); 
      return 0; 
   } 

   while( buffer[count] != '\0' )     /* Increment count as long as the */ 
      count++;                        /* current character is not null */ 
   printf("The string \"%s\" has %d characters", buffer, count); 

   return 0; 
} 


Program Analysis 


This looks like quite a long piece of code, but remember, since we're 
reading one character at a time, we have to look out for the end of the 
string, as well as check that the number of characters doesn't exceed the 
capacity of the array. 


The maximum legal index value for the array is MAX-1. The second 
condition, (count<MAX-2), means that the loop only stores characters at 
index positions up to MAX-3, so count can never be greater than MAX-2 
when the loop ends and there's always space for the ‘\0’ at the end. The 
test is a little more cautious than it strictly needs to be, since 
(count<MAX-1) would still leave room for the terminator. 


When the loop ends, it may be because ‘\n’ has been read, or because 
count is equal to MAX-2, or both. The if statement following the loop: 


if( ch == '\n' ) 


tests the variable ch for ‘\n’, and if it's encountered then ‘\0’ is stored in 
the current free position in buffer[] and marks the end of the string; there 
will always be at least one free element in buffer[] if ‘\n’ is found. 


If ch doesn't contain ‘\n’, then it means that the loop must have ended 
because count is equal to MAX-2, which is the most characters the program 
allows itself to store. The line entered is too long, so the program ends 
with an error message. 


Multi-dimensional Arrays 


The arrays we have defined so far, with one index for accessing elements, 
are referred to as one-dimensional arrays, or vectors. An array can also have 
more than one index value though. If an array has two index values it is 
called a two-dimensional array, and so on. 


A Multi-dimensional Example 


Suppose our enthusiasm for nature and the weather extends into farming, 
and that we have a field where we grow bean plants in rows of ten. We 
could declare an array to record the weight of beans produced by each 
plant with the statement: 


double beans[12][10]; 


This declares the two-dimensional array beans, the first index being the row 
number, and the second the number within the row. An equivalent way of 
envisaging this is as an array of 12 objects, each of which has an array of 
10 double elements. Referring to any particular element requires two 
indices, each enclosed in their own pair of square brackets. For example, we 
could set the value of the element reflecting the performance of the fifth 
plant in the third row with the statement: 
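

A statement of the following kind would do it. In this sketch the weight is passed in through a hypothetical function, record_weight(), so that any figure can be supplied, and the array is declared to match the description in the text: 

```c
double beans[12][10];      /* 12 rows with 10 plants in each */

/* Hypothetical helper: stores a weight for the fifth plant in the third row */
void record_weight(double weight)
{
   beans[2][4] = weight;   /* Third row is index 2, fifth plant is index 4 */
}
```

Note that both index values start from zero, just as they do for a one-dimensional array. 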





With our meteorological skills to help us, we are successful bean farmers so 
we can add several identical fields. Assuming we have five fields cultivated, 
we could use a three-dimensional array like this: 


double beans[5][12][10]; 





If we ever get to bean farming on an international scale we'll be able to use 
a four-dimensional array, with the extra dimension designating the country. 
Producing this sort of quantity of beans for human consumption, however, 
may start to damage the ozone layer. 


Multi-dimensional Arrays in Memory 


Arrays are stored in memory such that the rightmost index value varies 
most rapidly. You can visualize the array data[3][4] as 3 one-dimensional 
arrays, with 4 elements each: 


long data[3][4]; 

1000 data[0][0]    1004 data[0][1]    1008 data[0][2]    100C data[0][3] 
1010 data[1][0]    1014 data[1][1]    1018 data[1][2]    101C data[1][3] 
1020 data[2][0]    1024 data[2][1]    1028 data[2][2]    102C data[2][3] 


The diagram also shows the memory address of each element as a 
hexadecimal value, assuming that the array is stored at location 1000. 


Initialization of a Multi-dimensional Array 


To initialize a multi-dimensional array, you use an extension of the method 
used for a one-dimensional array. For example, you can initialize a two- 
dimensional array, data[], with the declaration: 





Thus the initializing values for each row are contained within their own 
pair of braces. 


Incomplete Rows 


When the values in any row are exhausted, the remaining array elements 
will be assigned zero. For example, with the declaration: 





the elements data[0][3], data[1][2] and data[1][3] have no initializing 
values and will therefore be initialized as zero. If you wanted to initialize 
the whole array with zeros, you could simply write: 
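

The three forms of initialization described in this section are gathered together in the sketch below, ending with the all-zeros case. The array names and the non-zero values are hypothetical; a 2 by 4 array is assumed because that matches the elements mentioned above: 

```c
/* Every element given a value, one pair of braces per row */
long full[2][4]    = { { 1, 2, 3, 4 }, { 5, 6, 7, 8 } };

/* Incomplete rows: [0][3], [1][2] and [1][3] are set to zero */
long partial[2][4] = { { 1, 2, 3 }, { 7, 11 } };

/* One value supplied, so every remaining element is set to zero */
long zeros[2][4]   = { { 0 } };
```
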





If you are initializing numeric arrays with even more dimensions, then 
remember that the braces are nested to the same number of levels as there 
are dimensions in the array. 


Storing Multiple Strings 


You can use a two-dimensional array of type char to store multiple strings. 
The first dimension defines the number of strings in the array, and the 
second the number of bytes available for each string. So the 
declaration: 


provides storage for 5 strings, with room for 50 characters in each, one of 
which is needed for the terminating null. 
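

As a sketch, such a declaration might be written like this. The name Names and the two initializing strings are assumptions added for illustration: 

```c
/* Hypothetical example: 5 strings with 50 bytes reserved for each.
   Strings not given an initial value start out empty. */
char Names[5][50] = { "Richard Feynman", "Isaac Newton" };
```

A reference with only the first index, such as Names[1], refers to a complete string. 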


Improving Our Output 


We could use this to improve the output for our first example in this 
chapter: 


/* EX4-04.C Determining which month to make it rain */ 
#include <stdio.h> 

#define SAMPLES 12                   /* Number of data samples */ 

int main() 
{ 
   const int CRITICALT = 70;         /* Critical temperature */ 
   int TempMax[SAMPLES] = { 50,45,52,60,66,69,67,84,85,71,67,53 }; 

   /* The last six rainfall values are a best reading of a damaged listing */ 
   float Rainfall[SAMPLES] = { 1.2f, 2.4f, 6.9f, 4.1f, 2.1f, 2.3f, 
                               0.2f, 1.8f, 5.7f, 3.3f, 2.9f, 1.1f }; 

   /* Month names; the second dimension of 10 is assumed */ 
   char Month[SAMPLES][10] = { "January", "February", "March", "April", 
                               "May", "June", "July", "August", 
                               "September", "October", "November", 
                               "December" }; 

   float RainMean = 0.0f;            /* Average rainfall */ 
   float BiMonthRain = 0.0f;         /* Mean of two months rain */ 
   int i = 0;                        /* Loop counter */ 

   /* Calculate the average rainfall */ 
   for( i=0 ; i<SAMPLES ; i++ ) 
      RainMean += Rainfall[i];       /* Sum total rainfall and */ 
   RainMean /= SAMPLES;              /* divide by number of samples */ 

   for( i=0 ; i<SAMPLES-1 ; i++ ) 
   { 
      BiMonthRain = (Rainfall[i] + Rainfall[i+1])/2.0f; 
      if( (BiMonthRain<RainMean) && (TempMax[i+1]<CRITICALT) ) 
         printf("\nFire the catapult in %s!", Month[i+1]); 
   } 

   return 0; 
} 


Program Analysis 


The lone output statement possibly contains the only modification difficult 
to understand: 


printf("\nFire the catapult in %s!", Month[i+1]); 





We just index the first dimension of the array Month[] with i+1 (not i+2, 
which we used to get the month number starting with January as 1), and 
use that as the output argument. We don't need to specify the second index 
to the array Month[], since we're referring to a complete string. A reference 
such as Month[1][2] refers to a single letter - in this case the letter ‘b’. 


The program is now more explicit in its recommendations: 


Fire the catapult in February! 
Fire the catapult in June! 

Fire the catapult in July! 

Fire the catapult in December! 


The sizeof Operator 


The sizeof operator comes in very handy when working with arrays. It's 
applied to a single operand and so is a unary operator. It returns the size 
of any object in bytes as an unsigned integer. If we've declared a variable 
number of type long, then the expression: 


sizeof number 


will have the value 4, assuming that a long occupies 4 bytes on your 
machine. Alternatively, you could use parentheses: 


sizeof( number ) 


Generic Sizes 


You can also obtain the size of a generic type, by placing the type name as 
the operand of sizeof. So we could use the statement: 


intSize = sizeof( int ); 


to record in the variable intSize how many bytes are occupied by an 
integer. Note that when you are using sizeof with a data type, you must 
use parentheses. 


Using sizeof to Aid Array Flexibility 


If we declare an array, such as: 


long numbers[][3] = { {20, 30, 40}, {40, 50, 60} }; 


from time to time you may well want to be able to add or delete rows, just 
by adding or deleting some of the values that initialize it. Ideally, you don’t 
want to be ferreting around the program, looking for all the places where 
you've fixed the number of rows - the sizeof operator can help you with 
this: 


NumRows = sizeof numbers/sizeof numbers[0]; 


The expression sizeof numbers will generate the number of bytes in the 
whole array, whilst the expression sizeof numbers[0] produces the 
number of bytes in one row of the array. Dividing the first by the second 
produces the number of rows in the array, so you can use the variable 
NumRows to control loop counts and row indexing of the array. If you add 
rows to the array, the value of NumRows will adjust automatically. 
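

Both calculations, the number of rows described above and the total number of elements, can be tried out with the sketch below. The functions row_count() and element_count() are hypothetical names introduced here; they work because numbers[] is declared outside them and is therefore visible as a complete array: 

```c
long numbers[][3] = { {20, 30, 40}, {40, 50, 60} };

/* Hypothetical helper: bytes in the whole array divided by bytes in one row */
int row_count(void)
{
   return (int)(sizeof numbers / sizeof numbers[0]);
}

/* Hypothetical helper: bytes in the whole array divided by bytes in one element */
int element_count(void)
{
   return (int)(sizeof numbers / sizeof numbers[0][0]);
}
```

Adding a third set of braces to the initializing list would change the results to 3 and 9 without any other alteration to the code. 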


If you need to know how many elements the array has, you could use the 
statement: 


NumElmnts = sizeof numbers/sizeof numbers[0][0]; 


Dividing the number of bytes in the array by the number of bytes in a 
single element will give you the number of elements in the array. 


Indirectly Accessing Data 


The variables we have dealt with so far provide you with the ability to 
name a memory location where you can store data of a particular type. The 
contents of a variable are either entered from an external source, such as 
the keyboard, or calculated from other values. There's another kind of 
variable in C which doesn't store data that you enter or calculate, but 
greatly extends the power and flexibility of your programs. This kind of 
variable is called a pointer. 


What is a Pointer? 


Each memory location that you use to store a data value has an address, 
which provides the means by which your computer references a particular 
data item. A pointer is a special kind of variable that can store the address 
of another variable. A pointer has a name just like any other variable, and 
it has an associated type which designates what kind of variable its contents 
can refer to. So, a pointer of type ‘pointer to double’ can only store the 
address of a variable of type double. 


Of course, since a pointer is a variable, it can store the addresses of 
different variables of a given type at different times during the execution of 
your program. 


Declaring Pointers 


The declaration for a pointer is similar to that of an ordinary variable, 
except that the pointer name has an asterisk in front of it. For example, to 
declare a pointer pnumber of type long, you could use the statement: 


long *pnumber; 





The asterisk identifies it as a pointer, and the type is read as a ‘pointer to 
long’. We will use the prefix p for pointer names, in order to distinguish 
them from other variables and make them more readily recognizable as 
pointers. 


You can mix declarations of ordinary variables and pointers in the same 
statement. For example: 


long *pnumber, number = 99; 


This declares both the variable pnumber of type ‘pointer to long’ as before, 
and also the variable number, of type long, with the initial value 99. 


Let's take an example to see how a pointer works without worrying about 
what it's for at this stage. Suppose we have the above long integer variable 
number, containing the value 99. We also have the pointer, pnumber, of type 
‘pointer to long’, which we could use to store the address of our variable 
number. But how can we obtain the address of a variable? 


The Address Operator 


What we need is a new operator, the address operator, &, which we first 
met when we used scanf() to get input. To set up the pointer we could 
write the assignment statement: 


pnumber = &number;     /* Store address of number in pnumber */ 





You can use the address operator to obtain the address of any variable, but 
you need a pointer of the same type to store it (there is one exception, 
which we will see later). For example, if you want to store the address of a 
double variable, then the pointer must have been declared as type ‘pointer 
to double’. 


Using Pointers 


Taking the address of a variable and storing it in a pointer is tremendous
fun, but the really interesting question is, what can you actually do with it?
Fundamental to making a pointer useful is the mechanism for accessing the
data value in the variable to which it points. This is done using the
indirection operator.


The Indirection Operator


The indirection operator, *, is used with a pointer variable to access the
contents of the variable pointed to. The name 'indirection operator' stems
from the fact that the data is accessed indirectly. It is also called the
dereference operator, and the process of accessing the data in the variable
pointed to via a pointer is termed 'dereferencing' the pointer.


One aspect of this operator that can sometimes be confusing is that we now
have several different uses for the asterisk symbol - it's the multiply
operator, the indirection operator and is also used in the declaration of a
pointer. Fortunately, the compiler is able to distinguish the meaning by its
context. When you multiply two variables, A*B for instance, there is no
meaningful interpretation of this expression for anything other than a
multiplication operation. Each context has a unique interpretation, so if you
have an example that doesn't immediately identify the context, then there's
something wrong with your code.
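

As a minimal sketch of how the address operator and the indirection operator work together, the fragment below stores the address of number in pnumber and then reads and writes number through the pointer. The function name indirection_demo is hypothetical, chosen only so that the fragment can be called and checked.

```c
#include <stdio.h>

/* Hypothetical demo function: exercises & and * on a long variable
   and returns the final value of number. */
long indirection_demo(void)
{
    long number = 99L;
    long *pnumber = &number;       /* pnumber holds the address of number */
    long doubled;

    doubled = 2 * *pnumber;        /* first * multiplies, second * dereferences */
    printf("number = %ld, doubled = %ld\n", *pnumber, doubled);

    *pnumber = doubled + 1L;       /* stores 199 in number through the pointer */
    return number;
}
```

Note that the expression 2 * *pnumber uses the asterisk in two of its roles, and the declaration of pnumber uses it in the third.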


Initialization 


Using pointers that haven't been initialized is extremely hazardous - an 
uninitialized pointer can point to anywhere in memory. You could use it 
accidentally to write to areas of memory that currently hold your operating 
system, and change the contents with disastrous results. 


Initializing a pointer with the address of a variable that's already been
declared is very easy. For example, to initialize the pointer pnum with the
address of the variable number, you just use the operator & with the
variable name:


long *pnum = &number;


When initializing a pointer with the address of another variable, the
variable must have already been declared prior to the pointer declaration.


The Null Pointer


Of course, you may not want to initialize a pointer with the address of a
specific variable when you declare it. In such cases you can initialize it with
the pointer equivalent of zero, guaranteed not to point to anything. The
standard header file STDIO.H defines the mnemonic NULL for this, so you
can declare and initialize it with the statement:


long *pnum = NULL;



A pointer with the value NULL is commonly called a ‘null pointer’. 





Using Null Pointers 


Using NULL to initialize a pointer ensures that the pointer doesn't contain a
valid address and makes it clear that this is a pointer being initialized with
zero. It also provides the pointer with a particular value that you can
distinguish in expressions. For example, an if statement with the condition
pnum == NULL checks whether the pointer pnum contains a valid address,
and can display a message if it doesn't. You could just as easily use the
condition !pnum, which achieves the same thing.


Attempting to store a value through a null pointer will usually result in an
error message, although execution won't necessarily stop at the point where
the error occurred.
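

The two forms of the null pointer test might be used as in the following sketch. The function names value_or and is_null, and the message text, are illustrative assumptions rather than anything taken from the original example.

```c
#include <stdio.h>

/* Returns the value pnum points to, or fallback if pnum is a null pointer. */
long value_or(const long *pnum, long fallback)
{
    if (pnum == NULL)                       /* explicit comparison with NULL */
    {
        printf("pnum does not point to anything.\n");
        return fallback;
    }
    return *pnum;                           /* safe: pnum holds a real address */
}

/* The shorter form of the same test: !pnum is 1 only for a null pointer. */
int is_null(const long *pnum)
{
    return !pnum;
}
```

Testing before dereferencing in this way means the program never attempts to use the contents of a null pointer.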


String Pointers 


A pointer of type 'pointer to char' has the interesting property that it can
be initialized with a string constant. For example, we can declare and
initialize such a pointer with the statement:


char *pproverb = "The higher the fewer.";



This looks very similar to initializing a char array, but differs in that it will
create a pointer to variables of type char, which is initialized with the
address of the constant array containing the characters "The higher the
fewer." The address of the array will be stored in the pointer pproverb. If
you were to just declare an array:


char proverb[] = "The higher the fewer.";


this would then produce a somewhat different result. Here only the
minimum memory necessary to accommodate the string is allocated:


[Figure: the memory allocated by char proverb[] = "The higher the fewer.";
compared with that allocated by char *pproverb = "The higher the fewer.";]


This shows the extra memory allocated for the pointer pproverb, which will 
contain the address of the beginning of the string, as opposed to the char 
array declaration which just allocates the space for the array proverb[]. 
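

The difference in memory use can be checked with sizeof, as in this minimal sketch. The function names are hypothetical, and the const qualifier on pproverb is an addition made here because the characters of a string constant shouldn't be modified.

```c
#include <stddef.h>

char proverb[] = "The higher the fewer.";        /* array with its own copy of the characters */
const char *pproverb = "The higher the fewer.";  /* pointer to a string constant */

/* Bytes reserved by the array declaration: 21 characters plus '\0'. */
size_t array_bytes(void)
{
    return sizeof proverb;
}

/* Extra bytes reserved for the pointer itself (machine dependent). */
size_t pointer_bytes(void)
{
    return sizeof pproverb;
}

/* The array's characters may be changed in place; returns the new first character. */
char lower_first(void)
{
    proverb[0] = 't';
    return *proverb;                /* the array name used as a pointer */
}
```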


Arrays of Pointers 


We can declare an array of pointers in the same way that we declare a 
normal array. 


An Example of Pointer Arrays 


We could use a pointer array to manage the month names in the previous
example. Only one statement in the program needs to be changed - you just
need to replace the declaration of the Month[] array with the following
declaration:


char *Month[] = { "January", "February", "March", "April",
                  "May", "June", "July", "August",
                  "September", "October", "November", "December" };


This declares Month[] as a one-dimensional array of pointers, where each
element is initialized with the address of a constant string. The program
will run exactly the same as before, so what's the difference?


The original two-dimensional char array had a fixed row length of ten 
characters. Here we have 12 independent string constants, each of which 
occupies the minimum amount of space - the length of the string plus one 
byte for the ‘\0’ character, plus an array of twelve pointers containing the 
addresses of the string constants. Using the array of pointers, the strings are 
stored without any wasted space. Take the following example: 


char Names[][15] = { "Nebuchadnezzar",
                     "Ned" };





char *Names[] = { "Nebuchadnezzar",
                  "Ned" };


Whether memory is saved by using a pointer array instead of the char 
array will depend on how variable the string lengths are. In this case the 
char array at the top occupies 30 bytes, whereas the definition using a 
pointer array at the bottom only requires 27 bytes, allowing 4 bytes for each 
pointer.


The memory occupied by a pointer will vary from one type
of machine to the next and even between different
circumstances on the same machine. It is typically 4 bytes on a
32-bit system and 8 bytes on a 64-bit system, and with 8-byte
pointers the pointer version of this small example would need
35 bytes. Of course, if all the strings are the same length,
then the pointer array will always require more memory, but this
isn't the typical situation.








There's also another difference between these two definitions. The char
array declaration allocates a fixed block of memory which has been
initialized with the two name strings. You cannot change the memory area
to which the array Names[] refers, but you are free to change the contents
to different strings within your program in any way that you wish, provided
that each string fits within the 15 characters of a row. On the
other hand, using the pointer-based approach you can change the addresses
in the pointer array, but you can't legally change the strings they point to.
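

The byte counts quoted for the two definitions of Names can be reproduced with sizeof and strlen(), as in the sketch below. The function names are hypothetical, and the pointer version is declared const char * here to reflect the fact that the strings themselves must not be changed.

```c
#include <string.h>

/* Bytes needed by the fixed-width char array: every row is 15 bytes wide. */
size_t char_array_bytes(void)
{
    char Names[][15] = { "Nebuchadnezzar", "Ned" };
    return sizeof Names;                       /* 2 rows * 15 bytes */
}

/* Bytes needed by the pointer version: the pointers plus the strings. */
size_t pointer_array_bytes(void)
{
    const char *Names[] = { "Nebuchadnezzar", "Ned" };
    size_t count = sizeof Names / sizeof Names[0];
    size_t total = sizeof Names;               /* the pointers themselves */
    size_t i;

    for (i = 0; i < count; i++)
        total += strlen(Names[i]) + 1;         /* characters plus '\0' */
    return total;
}

/* Exchanges two entries by swapping addresses; no characters are moved. */
void swap_names(const char *Names[], size_t a, size_t b)
{
    const char *ptemp = Names[a];
    Names[a] = Names[b];
    Names[b] = ptemp;
}
```

The strings account for 19 bytes, so the total for the pointer version is 19 plus twice the size of a pointer on the machine in use.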


Pointers and Arrays 


Arrays and pointers work in a surprisingly similar way, and you can 
normally interchange array or pointer notation. You can use array names in 
your programs as though they were pointers (with certain limitations), and 
you can also use an index value with a pointer. In most circumstances, if 
you use the name of a one-dimensional array by itself, it’s automatically 
converted into a pointer to the first element of the array. The exceptions are 
if the array name is the operand of the address operator, &, or of the 
operator sizeof. 


An Example of Pointers and Arrays


If we have the declarations:


double data[10];
double *pdata;


then we can write the assignment:


pdata = data;


This assigns the address of the first element of the array data to the pointer
pdata. If we use the array name data with an index value, then it refers to
the contents of the element corresponding to that index value. So, if we want
to store the address of that element in the pointer, we have to use the
address operator:


pdata = &data[1];


Here, pdata contains the address of the second element of the array.
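

The following sketch gathers these points into one function that can be checked. The name array_pointer_demo and the initial values in data are assumptions made purely for illustration.

```c
/* Returns 1 if every check on array and pointer notation holds, 0 otherwise. */
int array_pointer_demo(void)
{
    double data[10] = { 0.0, 1.5, 3.0 };   /* remaining elements are zero */
    double *pdata;
    int ok = 1;

    pdata = data;                          /* array name gives the address of data[0] */
    ok = ok && (pdata == &data[0]);
    ok = ok && (pdata[2] == data[2]);      /* an index can be used with a pointer */

    pdata = &data[1];                      /* address of the second element */
    ok = ok && (*pdata == 1.5);

    /* As an operand of sizeof, the array name is not converted to a pointer */
    ok = ok && (sizeof data == 10 * sizeof(double));
    return ok;
}
```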


Pointer Arithmetic 


Pointer arithmetic implicitly assumes that the pointer points to an array and 
that the arithmetical operation is on the address contained in the pointer. 
You are limited to addition and subtraction, but you can also perform 
comparisons. 


A Simple Arithmetic Example 


For example, the pointer pdata could be assigned the address of the third
element of the array data with the statement:


pdata = &data[2];


In this case, the expression pdata+1 would refer to the fourth element (the
address of data[3]). This is very important - pointer arithmetic isn't simple
arithmetic. It always operates in units determined by the type of the
pointer involved, so our expression doesn't add 1 to the address value in
pdata, it adds enough to make it point to the next element of type double.
We could make the pointer point to this element by writing the statement:


pdata += 1;                 /* Increment pdata to the next element */


This is different from the expression pdata+1 in that the expression doesn't
change the value of the pointer, but the assignment does. Here, the address
contained in pdata has been incremented by the number of bytes occupied
by one element of the array data.


Using Arithmetic 


In general, the expression pdata+n, where n can be any expression resulting 
in an integer, will refer to an address offset by n*sizeof(double) bytes 
from the pointer pdata, since it was declared to be of type pointer to 
double. This is illustrated here:


[Figure: the elements of the array data, with the addresses pdata+1 and
pdata+3 marked, for the statements below]


double data[10];
double *pdata;
pdata = &data[3];


In other words, incrementing or decrementing a pointer works in terms of
the type of the object pointed to. The change is in terms of a number of
elements of the given type. The most common notation for incrementing a
pointer is using the increment operator. For example:


pdata++;


This is equivalent to, and more usual than, the += form. However, the +=
form was used in the earlier statement just to make it clear that the
increment value is actually specified as 1, even though the address itself
usually changes by more than 1, except in the case of a pointer to char.


The address resulting from an arithmetical operation on a pointer can be 
anything from a value representing the address of the first element of the 
array, to the address which is one beyond the last element. Outside of these 
limits however, the behavior of the pointer is generally undefined. 
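

A short sketch can confirm that pointer arithmetic works in units of the type pointed to. The three function names below are hypothetical, and the argument n of byte_offset is assumed to lie between -2 and 8 so that the result stays within the array or one beyond its end.

```c
#include <stddef.h>

/* Number of bytes between pdata and pdata+n for a pointer to double. */
size_t byte_offset(int n)
{
    double data[10];
    double *pdata = &data[2];
    return (size_t)((char *)(pdata + n) - (char *)pdata);
}

/* Index of the element pdata refers to after being incremented. */
ptrdiff_t index_after_increment(void)
{
    double data[10];
    double *pdata = &data[2];

    pdata++;                             /* same effect as pdata += 1 */
    return pdata - data;                 /* the difference is in elements */
}

/* Counts the elements by comparing a moving pointer with the end address. */
int count_by_pointer(void)
{
    double data[10];
    double *p;
    int n = 0;

    for (p = data; p < data + 10; p++)   /* data+10 is one beyond the last element */
        n++;
    return n;
}
```

The casts to char * in byte_offset are there only to measure the distance in bytes; subtracting the double pointers directly would give the distance in elements.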


Dereferencing 


You can of course dereference a pointer that you have performed arithmetic
on. There wouldn't be much point to it otherwise. For example, assuming
that pdata is still pointing to data[3], the statement:


*(pdata+1) = *(pdata+3);


is equivalent to:


data[4] = data[6];


The parentheses are necessary when you want to dereference a pointer after
adding to the address it contains, since the precedence of the indirection
operator is higher than that of the arithmetic operators, + or -. If you write
the expression *pdata+1 instead of *(pdata+1), then this would add one to
the value stored at the address contained in pdata, which is equivalent to
executing data[3]+1. Since the result of this expression is a value rather
than a storage location, its use on the left of the assignment statement above
would cause the compiler to generate an error message. The difference
between these two expressions is illustrated in the diagram here:


[Figure: pdata points to data[3] and pdata+1 points to data[4]. *pdata
refers to the contents of data[3], so *pdata+1 is an expression defining a
calculation that adds one to that value; it is an error to use it on the left
of an assignment. *(pdata+1) simply refers to the contents of data[4] and
can thus be used on either side of an assignment.]
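

The effect of the parentheses can be seen in the following sketch, where the function names and the values stored in data are assumptions made for illustration.

```c
/* *(pdata + 1): the contents of the element after the one pointed to. */
double element_after(const double *pdata)
{
    return *(pdata + 1);
}

/* *pdata + 1: the contents of the element pointed to, plus one. */
double value_plus_one(const double *pdata)
{
    return *pdata + 1;
}

/* Performs the assignment discussed in the text and returns data[4]. */
double copy_demo(void)
{
    double data[10] = { 0.0, 10.0, 20.0, 30.0, 40.0,
                        50.0, 60.0, 70.0, 80.0, 90.0 };
    double *pdata = &data[3];

    *(pdata + 1) = *(pdata + 3);        /* same as data[4] = data[6] */
    return data[4];
}
```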


Using Array Names 


We can use an array name for operations on its elements as though it were 
a pointer. So we can also write the last statement as: 


*(data+4) = *(data+6);        /* The same as data[4]=data[6]; */


This kind of notation can generally be applied so that the corresponding
elements data[0], data[1], data[2], etc. can be written as *data,
*(data+1), *(data+2), etc.


An Example of Array Naming 


We could exercise this aspect of array addressing with a program to 
calculate prime numbers: 
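

What follows is a minimal sketch of such a program, reconstructed from the analysis of it rather than copied from the original listing. It is arranged as two functions, find_primes() and show_primes(), whose names are assumptions, and the inner loop is assumed to start at the second prime because the values tested are always odd.

```c
#include <stdio.h>

#define MAX 50                              /* number of primes required */

long primes[MAX] = { 2L, 3L, 5L };          /* first three primes defined */

/* Fills the rest of primes[] using pointer notation on the array name. */
void find_primes(void)
{
    long trial = 5L;                        /* last value known to be prime */
    unsigned int count = 3U;                /* primes found so far */
    unsigned int i;
    int found;

    do
    {
        trial += 2L;                        /* next odd value for checking */
        found = 0;
        for (i = 1U; i < count; i++)        /* start at 3: trial is never even */
        {
            found = (trial % *(primes + i)) == 0;
            if (found)
                break;                      /* exact division, so not a prime */
        }
        if (!found)
            *(primes + count++) = trial;    /* save the new prime */
    } while (count < MAX);
}

/* Displays the primes five to a line. */
void show_primes(void)
{
    unsigned int i;

    for (i = 0U; i < MAX; i++)
    {
        if (i % 5U == 0U)
            printf("\n");
        printf("%10ld", *(primes + i));
    }
    printf("\n");
}
```

Calling find_primes() and then show_primes() from main() gives a complete program. The sketch assumes MAX is greater than 3, since the do-while loop always looks for at least one more prime.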











Program Analysis 


The primes array which stores the results is seeded with the first three 
primes: 


long primes[MAX] = { 2L,3L,5L };     /* First three primes defined */


All the work is done in the do-while loop, which continues until MAX
primes have been found.


The algorithm is very simple and based on the fact that if a number isn't
prime, then it must be divisible by one of the primes found so far, all of
which are less than the number in question (in fact, only division by primes
no greater than the square root of the number in question needs to be
checked, so this example isn't as efficient as it might be).


Dividing the value to be tested by each of the known primes is done in the 
nested for loop. Since the value to be tested, held in the variable trial, is 
always odd, we don’t need to test for division by 2. The for loop only 
contains the if statement: 


if(found = (( trial % *(primes+i)) == 0) )
  break;                     /* it's not a prime so exit the for loop */


This first divides the value in trial by the current prime, *(primes+i)
(equivalent to primes[i]). If the remainder is zero, then this signifies that
the division is exact, so the specified number cannot be a prime. In this case,
the variable found will be set to 1, and the break statement is executed to
end the for loop. If the division isn't exact, then found will be set to zero
and the loop will continue until trial has been divided by all of the
primes found up until now.


If the loop ends, it may be due to the break being executed, or possibly
because all the primes found so far have been tried. Therefore it's necessary
to decide whether or not the value in trial was prime. This is indicated by
the value saved in the variable found. If trial does contain a prime, then
found will be zero and !found will be True, so the statement:


*(primes+count++) = trial;      /* ...so save it in primes array */


will be executed. This stores the new prime number in primes[count], and 
then increments count through the postfix increment operator (++).


If you compile and execute this example with MAX defined as 50, you
should get the output shown here:


2 3 5 7 11
13 17 19 23 29
31 37 41 43 47
53 59 61 67 71
73 79 83 89 97
101 103 107 109 113
127 131 137 139 149
151 157 163 167 173
179 181 191 193 197
199 211 223 227 229


In this example, pointers provide a very convenient and compact notation 
for programming operations with arrays. 


Handling Strings with Pointers 


Programming operations with character strings are almost invariably done 
with pointers, because pointers tend to provide the most natural way of 
handling strings, with extremely compact but nonetheless readable code. 


An Example of Copying a String with Pointers 


We can illustrate the technique with a simple example of copying a string: 
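

The heart of such an example is a copying loop along the lines of the sketch below. It is written here as a function called copy_string(), a hypothetical name, and the comparison with '\0' is spelled out in the loop condition, which is an assumption about style rather than a requirement.

```c
/* Copies the string at pSIn to pSOut, including the terminating '\0'.
   The destination must be large enough to hold the whole string. */
void copy_string(char *pSOut, const char *pSIn)
{
    while ((*pSOut++ = *pSIn++) != '\0')
        ;                       /* empty body: the condition does the copying */
}
```

With two arrays StringIn[] and StringOut[] of sufficient size, the call copy_string(StringOut, StringIn) copies the contents of the first into the second.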
Program Analysis


The complete copying process is accomplished in a single line:


while(*pSOut++ = *pSIn++);


The loop contains no statements - everything is done by the expression
controlling the loop. This diagram illustrates the position at the start of the
loop:


[Figure: pSIn holds the address of the first element of StringIn[] and pSOut
the address of the first element of StringOut[]. The value of the expression
*pSOut++ = *pSIn++ is the value of the character copied, after which pSIn++
and pSOut++ point to the next elements. When '\0' is copied the value of the
expression is zero, so the loop ends.]


Here, pSIn and pSOut point to the first element of the corresponding array.
The condition within the while loop is an assignment which copies the
contents of the location pointed to by pSIn to the location pointed to by
pSOut, that is, from an element of StringIn[] to the corresponding element
in StringOut[]. After the copy, the addresses stored in both pointers are
incremented to point to the next element in each array. The value of the
while loop expression is the value of the character copied. As long as the
character copied isn't zero, the loop continues. When the '\0' character is
copied, the value of the expression will be zero, and so the loop will end.


Note that we need to use StringOut as the argument, since
pSOut no longer points to the beginning of the string, because
the address it contains is modified in the loop.


void Pointers 


You can declare a pointer to be of type 'pointer to void', with a statement
such as void *pvoid; in which the name pvoid is simply an illustration.


This results in a pointer of no particular type, useful when you don't know 
in advance what type of pointer you are going to be dealing with. These 
are primarily used with functions, and particularly with the memory 
allocation functions in the standard library which we shall see later in this 
chapter. We will be discussing functions in general in Chapter 5. 


Note that assigning the address in a pointer to a pointer of a different type
requires an explicit typecast, unless the destination pointer is of type void.
Any type of pointer can be assigned to a pointer of type void, and
subsequently recovered. Note that even with a typecast, assigning a pointer
to another of a different type that isn't void can cause problems and should
be avoided.


A void Pointer Example


Here's an example, showing how void pointers can be cast: 
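

One way such an example might look is sketched below. The function and variable names (through_void, void_demo, pany and pback) are assumptions, as are the values used.

```c
#include <stdio.h>

/* Stores the address of a long in a void pointer and recovers it with a cast. */
long through_void(long *pnumber)
{
    void *pany = pnumber;              /* no cast needed when the destination is void * */
    long *pback = (long *)pany;        /* cast back to the original pointer type */
    return *pback;
}

/* A void pointer can hold addresses of different types at different times. */
double void_demo(void)
{
    long number = 99L;
    double value = 2.5;
    void *pany = &number;

    printf("%ld\n", *(long *)pany);    /* must cast before dereferencing */
    pany = &value;
    return *(double *)pany;
}
```

A void pointer can't be dereferenced directly, which is why each use of the contents goes through a cast to the type of the data actually stored.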


Pointer Notation with Multi-dimensional
Arrays


Some care is needed to keep your mind clear as to what is happening when
you use pointer notation with multi-dimensional arrays. By way of
illustration we can use an array beans with rows of four elements (the
number of rows, three here, is arbitrary), with the pointer pbeans declared as:


double beans[3][4];
double *pbeans = &beans[0][0];


The pointer will contain the address of the first element of the array. We
could quite easily have declared and initialized the pointer as:


double *pbeans = beans[0];


Using a two-dimensional array name with a single subscript refers to the
row of the array selected by the subscript, which is converted to the
address of its first element, so pbeans will be initialized with the address
of the first element of the first row of the array. This is the same as the
previous declaration.


Suppose we now initialize it with just beans:


double *pbeans = beans;


With most compilers you'll get a warning or an error message, because
although beans by itself is converted to an address, it is the address of the
row beans[0], a pointer to an array of four elements of type double,
rather than the address of a single element of type double. The two
addresses have the same value but are at different levels of indirection, so
our declaration is wrong, in spite of the fact that some compilers won't
complain about it. A correct initialization using just the array name would
be:


double *pbeans = *beans;


so that beans is dereferenced to just the address of an element. Note that a
pointer of the form double **, a pointer to a pointer, is not suitable for
beans either, because a two-dimensional array holds only its elements and
no stored addresses.


Referencing


You can reference each element of the array in three ways:


Using the array name with two index values.
Using the array name in pointer notation.
Using a separate pointer.


Therefore the following are equivalent:


beans[i][j]      *(*(beans+i)+j)      *(pbeans+4*i+j)


It's also possible to mix array and pointer notation, such as *(beans[i]+j)
or (pbeans+4*i)[j], but this has no obvious advantages and is best
avoided.
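

The equivalence of the three notations can be checked with a sketch like the one below, which assumes an array of three rows of four elements and a hypothetical function name, same_element. The separate-pointer form relies on the rows being stored one after another in memory.

```c
/* Returns 1 if all three notations reach the same element of beans. */
int same_element(int i, int j)
{
    static double beans[3][4];
    double *pbeans = &beans[0][0];
    int row, col;

    for (row = 0; row < 3; row++)          /* give every element a distinct value */
        for (col = 0; col < 4; col++)
            beans[row][col] = 10.0 * row + col;

    return beans[i][j] == *(*(beans + i) + j)
        && beans[i][j] == *(pbeans + 4 * i + j);
}
```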


Dynamic Memory Allocation 


Working with a fixed set of variables in a program can be very restrictive. 
The need often arises to adjust the amount of space available for storing 
different types of variables at execution time, depending on the input data 
for the program. You can create variables dynamically, as your program 
actually executes, by allocating a piece of memory and then accessing it 
through a pointer. The allocation of memory for variables at execution time 
is achieved through functions defined in the standard library, STDLIB.H. 


The Heap


In most instances, there is unused memory in your computer when your
program is executed. In C, this unused memory is usually called the 'heap'.


You can manage space on the heap using four functions declared in STDLIB.H:


malloc()     Allocates a block of memory on the heap with a size
             given in bytes by the integer value passed to the function.
             The function returns the address of the block of memory
             allocated, or NULL if the allocation fails.


calloc()     This function allocates a block of memory on the heap
             based on the two arguments passed to it. The first
             argument specifies the number of objects for which
             memory is required, and the second specifies their
             size (in bytes).


realloc()    This function changes the size of a block of memory that
             has previously been allocated. The first argument is a
             pointer to the block concerned, and the second specifies a
             new size (in bytes) for the block.


free()       This function de-allocates a previously allocated block of
             memory specified by a pointer passed as an argument.

You can allocate space on the heap for variables in one part of a program
and then release that space, returning it to the heap and making it available
for reuse later in the same program. This enables you to use memory
efficiently, and enables programs to handle much larger problems, involving
considerably more data than might otherwise be possible. The pointer to a
memory area allocated will be of type void *, which can store the address
of any kind of variable, but you should cast the pointer to the type of data
you're going to store.


The best way of understanding how dynamic memory allocation works is 
by looking at a couple of examples. 


Using malloc() for Dynamic Memory 
Allocation 


Let’s take a very simple example of a program that will read in an 
arbitrary number of strings, and then display them: 
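

The central step of such a program, copying one string into memory obtained from malloc(), can be sketched as follows. This is not the original listing: the function name save_string is an assumption, and reading the input and storing each returned address in an array of pointers are left to the caller.

```c
#include <stdlib.h>

/* Copies StringIn into newly allocated heap memory and returns its address,
   or NULL if malloc() fails. The caller releases the memory with free(). */
char *save_string(const char *StringIn)
{
    const char *pSIn = StringIn;
    char *pSOut;
    char *pStart;

    while (*pSIn++ != '\0')                 /* leaves pSIn one beyond the '\0' */
        ;

    /* The pointer difference is the string length including the '\0' */
    pSOut = (char *)malloc((size_t)(pSIn - StringIn));
    if (pSOut == NULL)
        return NULL;                        /* allocation failed */

    pStart = pSOut;                         /* remember where the copy begins */
    pSIn = StringIn;                        /* back to the start of the input */
    while ((*pSOut++ = *pSIn++) != '\0')    /* copy, including the '\0' */
        ;
    return pStart;
}
```

A calling program would store each address returned in the next free element of an array of pointers, and pass each one to free() when the strings are no longer needed.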







Program Analysis


The idea here is to read a string, find out how long it is (including the '\0'
character) and then, after getting sufficient memory allocated on the heap,
copy the string into it. This will repeat as long as you want to enter more
strings and as long as the total number of strings doesn't exceed
MAXSTRINGS.


After reading a string in the do-while loop, we find the address
immediately after the end of the string that was entered, with the loop:


while (*pSIn++ != '\0');


The pointer pSIn starts out containing the address of the first character of
the string. The loop adds one to the address stored in pSIn up to and
including the point where pSIn is pointing to a location that contains '\0'.
Thus pSIn will end up pointing to the address one beyond the '\0' at the
end of the string. We then call the library function malloc() to allocate
memory in the statement:


pSOut = (char *)malloc(pSIn-StringIn);


The expression passed as an argument to malloc() is the difference
between two pointers. For the difference between two pointers to be
meaningful, they need to point to members of the same array. In this case,
it defines the number of characters in the string, since StringIn is the
address of the first character of the input and pSIn contains the address of
the position one character beyond the end of the input.


After casting the address returned from the function malloc() to the type
pointer to char, we save it in pSOut so that we can use it in the copying
process. We then check that the pointer returned by malloc() isn't NULL
with the statements:


if(pSOut == NULL)             /* Verify we got some memory      */
{                             /* If not - report and exit       */
  printf("\nMemory allocation failed - program terminated.");
  return 1;
}


If the pointer value returned by malloc() is NULL, the expression in the if
will be True and the program will end after displaying the message. The
value returned from the program is handed over to the operating system.


By choosing different values for different kinds of problems,
the value returned can be used to indicate the condition
causing the program to be terminated.


If the address we got from malloc() isn't NULL, then we also store it in
the current free position in the array pString[]. The copying is then done
using a while loop, but because we've modified the contents of pSIn, it's
necessary to reset it back to point at the beginning of the array StringIn[].


Using calloc() for Dynamic Memory
Allocation


Many programmers prefer to use the calloc() library function rather than
malloc() for obtaining space in the heap because, as well as allocating
memory for a number of objects of a particular type, it also initializes the
memory to zero. We could use calloc() in an
alternative version of our primes program, which will produce as many
primes as available memory will allow:








  do
  {
    trial += 2L;                      /* Next value for checking */
    found = 0;                        /* Set found indicator */

    /* Try division by existing primes */
    for ( i=1U ; i<count ; i++ )
    {
      /* found will be 1 for exact division */
      /* and if division is exact, it's not a prime */
      if(found = (( trial % *(pPrimes+i)) == 0) )
        break;                        /* it's not a prime so exit the for loop */
    }

    if (!found)                       /* We got one... */
      *(pPrimes+count++) = trial;     /* ...so save it in primes array */
  }while (count < NumPrimes );

  /* Output primes 5 to a line */
  for( i=0U ; i<NumPrimes ; i++)
  {
    if( (i%5U)==0U )                  /* New line on 1st, and every 5th value */
      printf("\n");
    printf("%10ld", *(pPrimes+i));
  }
  free (pPrimes) ;                    /* Release the memory before we go */
  return 0;


Program Analysis 


This is very similar to the original, so let's just look at the changes. The
primes are now to be stored on the heap, so we have the pointer pPrimes
which will store the address of the space that is allocated:


long *pPrimes=NULL;             /* Pointer to primes array */


The value of NumPrimes is used as the first argument to calloc(), with the
second argument being sizeof(long), so the function will provide space for
NumPrimes elements of type long:


if ((pPrimes=(long*)calloc(NumPrimes, sizeof(long)))==NULL)


Both arguments to calloc() are of type size_t.


The type size_t is defined in the standard library as an unsigned integer
type (unsigned int with many compilers), and values of this type are
returned by the operator sizeof. The variables count and NumPrimes, and the
loop counter i, have all been declared as type unsigned int for consistency
and to avoid compiler warnings which you may get if they're different. So
the maximum number of primes the program will generate is limited to the
maximum value of an unsigned int, although it may also be limited by the
maximum value of a type long number and the amount of time you are
prepared to wait for output.
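

The two properties of calloc() relied on here, allocation by number of objects and initialization to zero, can be checked with a small sketch. The function name count_zeroed is hypothetical, and the function returns 0 if the allocation fails.

```c
#include <stdlib.h>

/* Allocates count longs with calloc() and returns how many of them are zero. */
size_t count_zeroed(size_t count)
{
    long *pValues = (long *)calloc(count, sizeof(long));
    size_t zeros = 0;
    size_t i;

    if (pValues == NULL)
        return 0;                          /* allocation failed */

    for (i = 0; i < count; i++)
        if (*(pValues + i) == 0L)          /* pointer notation, as in the primes example */
            zeros++;

    free(pValues);                         /* release the memory before returning */
    return zeros;
}
```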


Extending a Memory Area 


The realloc() function allows you to increase the size of a memory block
that you've already created on the heap. Assuming that you've already
created an initial area of say 100 bytes to store a string, with its address in
a pointer pArea, you can increase the area with:


pArea = (char *)realloc((void *)pArea, 200);


The new area will be 200 bytes. The first argument to realloc() is cast
to type 'pointer to void' because this is the type the function expects. The
second argument specifies the size of the new area, and its address is
returned by the function. Any data stored in the original area on the heap
pointed to by pArea will remain, although the block itself may have been
moved to a new address, and the additional space won't be
initialized. The size of a memory area on the heap can also be reduced by
this function, in which case the data that fits in the smaller area is
retained.


If the function realloc() can't reallocate the memory, then it will return
NULL, and in this situation it leaves the original memory area unchanged. If
you want to retain access to the original memory area in these circumstances,
you need to save the address returned from realloc() in a different
pointer from that containing the address of the original memory area.
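

Saving the result of realloc() in a separate pointer might look like the following sketch. The function name grow_area and the use of a pointer to a pointer as its first parameter are assumptions made so that the caller's pointer can be updated only when the call succeeds.

```c
#include <stdlib.h>

/* Changes the block at *ppArea to new_size bytes. Returns 1 on success.
   On failure returns 0 and leaves *ppArea pointing to the original block. */
int grow_area(char **ppArea, size_t new_size)
{
    char *pNew = (char *)realloc(*ppArea, new_size);

    if (pNew == NULL)
        return 0;                  /* original area is untouched and still valid */

    *ppArea = pNew;                /* adopt the new address, which may differ */
    return 1;
}
```

Because the original pointer is overwritten only after a successful call, a failed reallocation loses neither the data nor the ability to free() the original area.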





Summary 


We have seen how arrays are declared and used and how they relate to
pointers. While arrays have a role in many numerical calculations, pointers
are by far the most common basis for processing and managing data in C. It's
therefore most important that you hone your knowledge and skill in the use
of pointers. They are essential to good programming in C.


The important points we have discussed in this chapter are: 


Arrays enable you to define a number of elements of one type and 
manage them through a single variable name. Individual elements in 
an array are referenced using one or more index values. 


A pointer is a variable that can store the address of another variable
or value of a specified type. You can obtain the address of a variable
using the & operator, and you can use the value stored at the
location pointed to with a pointer by using the dereference operator *.


You can perform arithmetic on pointers. You can add or subtract a 
constant integer value, in which case the change in the address is in 
terms of a number of units of the pointer type. You can also subtract 
one pointer from another when they point to elements of the same 
array - the difference will be in terms of the number of elements. 


Pointers and arrays are quite strongly related. You can use array 
notation with a pointer and vice versa, but remember - an array 
name is not a pointer, so you can't modify it. You can only use it in 
expressions to access an element of the array, whereas you can 
modify the address stored in a pointer. 


You can allocate memory for your data using functions provided by 
the standard library. In order to use them you must include the 
header file STDLIB.H in your program. 


Programming Exercises


1   Write a program to read a string into an array, and then reverse the
    sequence of characters in the string and display the result. You can
    exercise the program with palindromes such as:

    Madam I'm Adam
    A man a plan a canal Panama
    Ned I am a maiden


2   Write the same program, but this time use pointers.


3   Write a program using pointers to compare two strings. Do this by
    making character by character comparisons. The string with the first
    character that has a code value greater than the other is the greater. If
    one string is longer than the other, and the characters in the shorter
    string are identical to the corresponding characters in the longer
    string, then the longer string is the greater.


4   Write a program using pointers to read a string, and to capitalize the
    first letter of each word. Assume the string begins with a word, and
    that each succeeding word is preceded by a space. Don't forget to
    allow for the possibility that the first letter of a word may already be
    a capital.


5   Write a program using pointers to count the frequency of different
    letters in a string.
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Now that we understand the computational aspects of C, and the kinds of 
data that we can deal with, we are going to look into how the components 
of a C program are put together. To successfully produce a program of 
significant complexity, it's essential to be able to break the program up into 
manageable units. Defining these units is the subject of this chapter. By the 
end of this chapter, you will have learnt: 


The concept and structure of a C function. 
What a function prototype is, and why it's necessary. 


How information is passed to a function and how you can get 
results back. 


How pointers are used to transfer information to and from a 
function. 


How static variables can be used within a function. 


What a recursive function is and how recursion works. 


How to process command-line arguments in your program. 
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Functions 


Functions are the basic building blocks for creating a program. All the 
examples we have written have consisted of a single function, main(), but 
as we saw in Chapter 1, a real world C program usually comprises many 
functions. A function is a self-contained block of code with a specific 
purpose. It can have data passed to it and it can return a value. It has a 
unique name to identify it, governed by the same rules as those for a 
variable, and that function name is used to call it for execution. 


Note that a function name needs to be unique among the functions
and global variables in your program. A local variable or a statement
label can have the same name as a function, although the local
variable hides the function within its own scope, and using the same
name for a variable, label and function isn't a particularly good habit.





Executing a function is referred to as a function call, or calling a function.
When a function is called, execution transfers to the first statement of the
function, and on its completion, the function returns control to the calling
point. A function can be called as many times as necessary from different
points in a program. Therefore, if you have a computation used several
times in a program, packaging it into a function will save considerable
memory space because this will avoid duplicating code.


The Structure of a Function 
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The first line of a function, specifying its name, is called the function header. 


This is followed by the function's executable code, called the body of the 
function, which is situated between curly braces. The structure of a function 
is illustrated in this diagram: 


[Diagram: the power() function annotated to show the return type (double), the function name, the parameter list (double x, int n), the function header, the function body and the return values.]


The diagram also shows the components of the function header, which we'll
discuss later. All the variable names that are declared within the body of a
function are local to that function, so you don’t have to worry about
duplicating names that are used in other functions.


Program Analysis 


Let’s look at the example shown in the illustration in a little more detail. It
will raise a value of type double to a given non-negative integral power, that
is, compute x to the power n. Here is a commented version:
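A sketch of the function, reconstructed from the description in the sections that follow (a result that starts at 1.0, an arbitrary 0.0 returned for a negative n, and a loop that counts n down); the exact layout and comment wording are assumptions rather than the original listing:

```c
/* Raise x to the integral power n; returns 0.0 if n is negative */
double power( double x, int n )      /* Function header */
{                                    /* Function body starts here */
   double result = 1.0;              /* x to the power 0 is 1 */

   if( n < 0 )                       /* Negative powers are not handled */
      return 0.0;                    /* Arbitrary return value */

   while( n-- > 0 )                  /* Multiply by x, n times */
      result *= x;

   return result;                    /* Return x to the power n */
}
```

Because n is a copy of the caller's argument, counting it down inside the function has no effect outside it.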
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To understand how this works, let's look at it one part at a time. 


The Function Header 


Typically, we won't include the initial description of functions in our 
examples, in order to avoid overly inflating the page count. You should, 
however, get into the habit of including a description in your examples. 


The return value is returned to the calling function when execution of the
function is finally completed. The value to be returned is specified within
the body of the function by a return statement. As you can see here, there
can be more than one return statement in a function.


When our function is called by using it within an arithmetic expression, the 
double value returned will be used in the expression’s evaluation. Any 
function that has a return type other than void must have a return 
statement specifying the value to be returned. 


Our function has two parameters - the first is x, the value to be raised to a
given power (of type double) and the second is n, the value of the power
to which x is to be raised (of type int). Note that no semi-colon is
required at the end of the function header. If you include one, then it
becomes a prototype and the code that follows it becomes erroneous.
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Function Header Syntax 


The general form of a function header can be written as:

return_type FunctionName( parameter_list )


The return_type can be any legal type. If the function doesn't return a
value, then the keyword void should be specified. The void keyword is
also used to indicate the absence of parameters, so a function that has no
parameters and doesn't return a value would have a header like this:


void MyFunction( void )


If nothing is specified for the parameter list in a function definition, then
this is also an indication that the function has no parameters. In a
prototype, however, empty parentheses tell the compiler nothing about the
parameters, so it can't check the arguments; writing void is the safer choice.


You mustn't use a function with a return type specified as void in an 
expression combining it with other variables or constants in your program. 
Since it doesn't return a value, it can't participate in any calculation defined 
by an expression. You can only use it in a statement by itself. 


The Function Body 


The computation is performed by the statements in the block following the
function header. This is called the function body. In our example, the first
statement declares a variable result, initialized with the value 1.0, because
any number raised to the power 0 is equivalent to 1. The variable result
is local to the function, as are all variables declared within the function
body. This means that the variable result is automatically created and
initialized each time the function is called, and ceases to exist after the
function has completed its execution. For this reason, variables local to a
function are sometimes called automatic variables. Once execution of the
function is finished, the memory that the variable result occupies may well
be used for something else.


After the declaration of the variable result, the if statement checks whether 
n is negative. If it is, then a value of 0.0 is returned arbitrarily, since the 
function isn't intended to deal with negative values for n. You could easily 
add code here to deal with a negative exponent, but you would need to 
check that x wasn't zero. 
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As we have said, the names of all the variables declared within the body of 
a function are local. This includes the parameter names too. There is 
nothing to prevent you from using the same names for variables in other 
functions. Indeed, this is just as well. It would be extremely difficult to 
ensure that variable names are always unique with a program containing a 
large number of functions, particularly if they weren't all written by the 
same person. 


The return Statement 


The first return statement returns 0.0 if n is negative, and the second 
return statement returns the value of result. The value is returned to the 
point where the function was called. The thought that might immediately 
strike you is that we just said result ceases to exist on completing 
execution of the function - so how is it returned? The answer is that a copy 
is made of the value being returned, and this copy is made available to the 
return point in the program. 


The general form of the return statement is: 
return expression; 


where expression must evaluate to a value of the type specified in the 
function header for the return value. The expression can be any 
expression, as long as you end up with a value of the required type. 


If you've specified the type of return value as void, then there must be no 
expression appearing in any return statement within the function. It must 


simply be written as: 
return; 


With a return type of void, you aren't obliged to include a return 
statement in your function at all - when execution of the body of such a 
function reaches the closing brace, it will automatically return to the calling 
point in your program. 
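As a brief sketch of these rules, using two hypothetical functions that are not part of this chapter's examples: Larger() returns the value of an expression of the declared type, while ShowPositive() returns no value, leaving early with a plain return when there is nothing to display and otherwise running off the closing brace.

```c
#include <stdio.h>

/* Return the larger of two values; any int expression will do */
int Larger( int a, int b )
{
   return a > b ? a : b;
}

/* Display value only if it is positive; nothing is returned */
void ShowPositive( int value )
{
   if( value <= 0 )
      return;                        /* Plain return: no expression allowed */

   printf("\nvalue is %d", value);
}                                    /* Control returns to the caller here */
```

A call such as ShowPositive( Larger( 3, 7 ) ); uses the first function in an expression and the second in a statement by itself.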
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Using Functions 


Before you can use a function in a program, you must declare it using a 
statement called a function prototype. This enables the compiler to check that 
the usage of the function is correct. 


Function Prototypes 


A prototype for a function provides the compiler with basic information
about how the function is used. It specifies the parameters to be passed, the
function name and the type of the return value, essentially the same
information as the function header, with the addition of a semi-colon. The
compiler is able to check the types of the arguments passed to a function
and to verify that they correspond with the types of parameters appearing
in the prototype. If they don't match, then if possible, the compiler will
automatically convert the arguments you use to the required type, or else it
will issue an error message.
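A small sketch of this checking, using a hypothetical function Half() that is not part of the chapter's examples: because the prototype is visible before the call, the int argument 3 is converted to double before it is passed.

```c
double Half( double value );         /* Prototype: the parameter is a double */

/* Calls Half() with an int argument; the prototype makes the
   compiler convert 3 to 3.0 before the call */
double HalfOfThree( void )
{
   return Half( 3 );
}

/* Return half of the value passed */
double Half( double value )
{
   return value / 2.0;
}
```

Without the prototype, an older compiler would pass the argument as an int and the function would misinterpret it.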


The prototypes for the functions used in a program must always appear
before the functions are called, and are usually grouped together at the
beginning of a program. The header files we've been including for standard
library functions include the prototypes of all the functions provided by that
library.


Strictly speaking, a prototype is only necessary if a function is called before
it is defined; if, in your source file, you put all your functions before the
main routine, then the compiler has all the information it needs and
prototypes aren't necessary. However, it takes little effort and is always good
practice to put them in.


Function Parameter Naming 


For our power() example we could write the prototype as: 


double power( double value, int index );
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Note that we've specified different names for the parameters, just to show 
that this is possible. Normally, in the definition of the function, the same 
names are used in the prototype as in the function header, but it doesn't 
have to be so. You can choose the parameter names in the function 
prototype to help you understand exactly what they're used for. 


You can also omit the names altogether if you like, and just write:


double power( double, int );


This is just enough for the compiler to do its job, but it's better practice to 
use some descriptive labels in a prototype, and in some cases it can make 
all the difference. If you have a function with two parameters of the same 
type and you omit the names from the prototype, you'll have no 
information about which parameter comes first. 


A Simple Function Example 
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We can exercise the options available with a function, by trying out our 
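power() function in an example:


#include <stdio.h>

double power( double x, int n );     /* Function prototype */

int main( void )
{
   int index = 3;                    /* Raise to this power */
   double x = 3.0;                   /* Different x from that in function power */
   double y = 0.0;                   /* Store return value here */

   y = power( 5.0, 3 );              /* Passing constants as arguments */
   printf("\n5.0 cubed is %f", y);   /* Display the result */

   /* Calling the function in an argument to printf() */
   printf("\n3.0 cubed = %.3f", power( 3.0, index ));

   /* Calling the function in an argument to a call of the same function */
   x = power( power( x, 2 ), index );   /* Computes x to the power 6 */
   printf("\nx = %.2f\n\n", x);

   /* Using a function in a loop */
   for( index = 0; index <= 8; index++ )
      printf("%6.0f", power( 2.0, index ));

   return 0;
}

/*****************************************
 * A function which will compute the     *
 * integral power, n, of a double value, *
 * x, and return the result as double.   *
 *****************************************/
double power( double x, int n )      /* Function header */
{                                    /* Function body starts here */
   double result = 1.0;              /* Result stored here */

   if( n < 0 )                       /* Check that n is not negative */
      return 0.0;                    /* Arbitrary return */

   while( n-- )
      result *= x;                   /* Calculate x*x*...*x with n terms */

   return result;
}                                    /* Function body ends here */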


Program Analysis 


This program shows some of the ways in which we can call the function
power() and specify its arguments. If you run this example, you will get
this output:


5.0 cubed is 125.000000
3.0 cubed = 27.000
x = 729.00

     1     2     4     8    16    32    64   128   256


You will have already gathered from some of our previous examples that
using a function is very simple. To use the function power() to calculate 5
cubed and store the result in a variable y in our example, we have written:


y = power( 5.0, 3 );                 /* Passing constants as arguments */


The values 5.0 and 3 are called arguments. They happen to be constants,
but any expression can be used as an argument, as long as, ultimately, a
value of the correct type is produced. The arguments substitute for the
parameters x and n, which were used in the definition of the function. The
computation is performed using these values and a copy of the result, 125.0,
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will be returned to the calling function, main(), and stored in y. You can 
think of the function as having this value in the statement or expression in 
which it appears. 


The next call of the function is actually used within the output statement:


printf("\n3.0 cubed = %.3f", power( 3.0, index ));


so the value returned from the function is used as the argument to 
printf(). Since we haven't stored the returned value anywhere, we can't 
access it or use it for any other purpose. 


The power() function is next used in the statement:


x = power( power( x, 2 ), index );


where the function will be called twice. The first call to be executed is
the rightmost in the expression, appearing as the first argument to the
second call of the function. The double result, 9.0, will be returned and
inserted as the first argument in the next call of the function, with index as
the second argument. Since index has the value 3, the value of 9.0 cubed will
be computed and the result, 729.0, stored in x. This sequence of events is
illustrated here:


[Diagram: the inner call power( x, 2 ) returns 9.0, which is passed with the index value 3 to the outer call; the result, 729.0, is stored in x.]





Passing Arguments to a Function 


It is most important to understand how arguments are passed to a function
in C, as it will affect how you write functions and how they will ultimately
operate. There are also a number of pitfalls to be avoided, so we'll look at
this mechanism more closely. The arguments specified when a function is
called should usually correspond in type and sequence with the parameters
appearing in the definition of the function. If they don't, then your compiler
should convert them so that they do, or generate an error message if this
isn't possible.


In C, a function has no access to the original values you use as arguments.
The argument values are copied, and the copies are passed on to the
function. Because the function is working with copies, we were able to
decrement the parameter n quite safely, without affecting the original
argument. This mechanism is called the pass-by-value (or pass-by-copy)
method of transferring data to a function.


The Pass by Value Mechanism 


With this method, the variables or constants you specify as arguments
aren't themselves passed to a function at all; only copies of their values
are. Consequently, a function cannot directly modify the arguments passed.
We can demonstrate this by deliberately trying to do so in this example:


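/* EX5-02.C A futile attempt to modify caller arguments */
#include <stdio.h>

int AddTen( int value );             /* Function prototype */

int main( void )
{
   int value = 3;                    /* Argument value to be passed */

   printf("\nvalue starts as %d", value);
   printf("\nAddTen(value) returns %d", AddTen( value ));
   printf("\nvalue is now %d", value);

   return 0;
}

/* Function to increment a variable by 10 */
int AddTen( int value )
{
   value += 10;                      /* Increment the local copy */
   return value;                     /* Return the incremented value */
}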
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Program Analysis 


Of course this program is doomed to fail due to a shortage of leprechauns. 
If this program modifies the caller argument on your computer, then the 
most likely explanation is that a leprechaun sold you a dubious compiler. If 
you compile this example and run it, you should get this output: 


value starts as 3 
AddTen(value) returns 13 
value is now 3 


This confirms that the original value of value remains untouched. The 
incrementation occurred on the local copy of value, which was eventually 
discarded when we exited from the function. 


Clearly the pass-by-value mechanism provides a high degree of protection 
from having caller arguments mauled by a rogue function, but it is 
conceivable that we might actually want to modify caller arguments. There 
is, of course, a way to do this. Didn't you just know that pointers would 
turn out to be incredibly useful? 


Pointers as Arguments to a Function 


When you use a pointer as an argument, the pass-by-value mechanism still
operates as before. However, a pointer contains the address of another
variable, and if you take a copy of this address, the copy still points to the
same variable.


Specifying a pointer as a parameter enables your function to get at a caller
argument. If we change the last example to use a pointer, we can
demonstrate this effect:







/* EX5-03.C Modifying caller arguments through a pointer */
#include <stdio.h>

int AddTen( int *pvalue );           /* Function prototype */

int main( void )
{
   int value = 3;                    /* Argument value to be passed */

   printf("\nvalue starts as %d", value);
   printf("\nAddTen(value) returns %d", AddTen( &value ));
   printf("\nvalue is now %d", value);

   return 0;
}

/******************************************
 * Function to increment a variable by 10 *
 * This works without the aid of a fairy  *
 * ring or a leprechaun.                  *
 ******************************************/
int AddTen( int *pvalue )            /* Using a pointer should help... */
{
   *pvalue += 10;                    /* Increment the caller argument - confidently */
   return *pvalue;                   /* Return the incremented value */
}


Program Analysis 


In this version of the program, the function AddTen() has been modified to 
accept a pointer as an argument, and to work through the address passed 
as an argument. The prototype for the function now has the parameter type 
specified as a pointer to int, and in the function main(), the address of the 
variable value is passed to the function. The function AddTen() will still 
receive a copy of the address passed, but the copy will still point to the 
same memory location, so the variable value in main() is modified by the 
function. This is confirmed by the output from the program. 


In the rewritten version of the function AddTen(), both the statement 
incrementing the value passed to the function, and the return statement, 
now need to dereference the pointer in order to use the value. 


You can now see why scanf() needs to have the arguments that determine
where the input is to be stored specified as addresses. The only way
scanf() can modify variables in your program is if pointers are passed as
arguments. Equally, you should be able to see why forgetting to prefix a
variable name in the argument list to scanf() with & causes such
problems. The function treats whatever you pass as an argument as an
address, and since it has no way to authenticate it, it will attempt to store
all the input there, regardless of its validity - perhaps writing a string
where your operating system is stored, causing an inevitable system crash!


Arrays as Function Arguments 


You can also pass an array to a function. In this case, however, the array 
isn't copied, even though a pass-by-value method of passing arguments still 
applies. The array name is specified as the argument, converted to a 
pointer, and a copy of this pointer to the beginning of the array is passed 
to the function. This is quite advantageous, since copying a large array for 
each call of a function could be very time-consuming, and expensive on 
memory. We can illustrate the ins and outs of this by writing a function to 
compute the length of a string that is passed to a function in a char array: 
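A sketch of such a function, written to match the analysis that follows (array notation, a counter called Length, and a result that includes the terminating '\0'); the details of the loop are an assumption, and the main() that declares the Quote[] array is left out:

```c
/* Compute the length of a string, including the '\0' */
int StrLength( char array[] )
{
   int Length = 0;                   /* Index of the current character */

   while( array[Length++] != '\0' )  /* Length is incremented even when */
      ;                              /* the '\0' itself is tested       */

   return Length;                    /* Count includes the '\0' */
}
```

Called as StrLength( Quote ), the function receives only a copy of the address of the first element, so no dimension is needed in the parameter.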
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Program Analysis 


The function StrLength() will work with a character array of any length. 
As you can see from the prototype, there's only one parameter, array[], 
which appears without a specified dimension. When specifying a parameter 
as a one-dimensional array there is little point in actually specifying a 
dimension, since only the address of the first element is passed as an 
argument. Multi-dimensional arrays are a little different, as we shall see 
later. 


The initializing string, a quote from Samuel Johnson, defines the length of 
the array Quote[]. The initializing string is defined as two concatenated 
string constants, simply because as a single string constant, it’s too long to 
fit on the page. 


The value returned from StrLength() includes the ‘\0’, so we subtract 1
in the argument to printf() to obtain the character count, excluding the
‘\0’.


When array[Length] contains ‘\0’, Length will be incremented once
more and the loop will end. The final value of Length is returned as the
count of the number of characters, including the ‘\0’.


If you run the example it will output the length of the string as 85, 
confirming that everything works as we anticipated. 


Example Modification 


However, we haven't exhausted all the possibilities here. As we determined 
at the outset, the array name is passed as a pointer, in fact as a copy of a 
pointer, so within the function we don't have to deal with the data as an 
array at all. We could modify the function to work with pointer notation 
throughout, despite the fact that we started out with an array in main(), 
and that the pointer passed to the function contains the address of an array. 


/* EX5-05.C Passing an array to a function and using it as a pointer */
#include <stdio.h>

int StrLength( char *array );        /* Function prototype */

int main(void)
{
   char Quote[] =
      { "Sir, I have found you an argument; "
        "but I am not obliged to find you an understanding." };

   printf("\nThe string:\n\t%s\nhas %d characters.",
                              Quote, StrLength(Quote)-1);

   return 0;
}

/************************************************
 * Function to compute the length of a string   *
 * including the ‘\0’.                          *
 ************************************************/
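The body of the function can be sketched as follows, consistent with the analysis below: the pointer parameter is stepped along the string in the while loop, and the length is the difference between two pointers. The name start for the saved address is an assumption.

```c
/* Compute the length of a string, including the '\0' */
int StrLength( char *array )
{
   char *start = array;              /* Remember where the string begins */

   while( *array++ != '\0' )         /* Modify the copy of the address */
      ;

   return (int)( array - start );    /* Elements between the two pointers */
}
```

When the loop ends, array points one element past the '\0', which is why the difference includes the terminator.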


Program Analysis 
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The prototype and the function header have been changed, although neither 
is absolutely necessary. If you change both back to the original version, with 
the first parameter specified as an array, and leave the function body 
written in terms of a pointer, then it'll work just as well. 


The most interesting aspect of this version is the while loop statement: 





where we apparently break the rule about being unable to modify an 
address specified as an array name. In fact, we aren't actually breaking the 
rule. You may recall that the pass-by-value mechanism makes a copy of the 
original array address and passes that to the function, so here we're 
modifying the copy, and the original array address will be unaffected. As a 
result, whenever we pass a one-dimensional array to a function, we're free 
to treat the value passed as a pointer and to change the address in any 
way that we wish. 


The length of the string is computed in the return statement as the
difference between the two pointers. We saw this method of obtaining the
length of a string in the previous chapter (Ex4-07.c). Of course, this
version of the program produces exactly the same output.


Passing Multi-dimensional Arrays to a Function 


Passing a multi-dimensional array to a function is quite straightforward. For
instance, given the array:


char Strings[10][80];


you could write the prototype of a hypothetical function SortStrings(), as:


int SortStrings( char Strings[10][80] );


You may be wondering how the compiler can know that it's defining an 
array of the dimensions shown as an argument, and not a single array 
element. Well, the answer is simple - you can't write a single array element 
as a parameter, only as an argument. 


When defining a multi-dimensional array as a parameter, you can also omit 
the first dimension value. Of course, the function will need some way of 
knowing the extent of the first dimension. For example, you could write: 


int SortStrings(char Strings[][80], int index ); 


where the second parameter would provide the necessary information about 
the first dimension. Here, the function can operate with a two-dimensional 
array with any value for the first dimension, but with the second dimension 
fixed at 80. 
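As a sketch of how the second parameter is used, the hypothetical function LongestString() below (a simpler stand-in for SortStrings(), which is not defined in the text) walks the first dimension using a count supplied by the caller:

```c
#include <string.h>

/* Return the index of the longest string among the first count rows */
int LongestString( char Strings[][80], int count )
{
   int longest = 0;                  /* Index of the longest row so far */
   int i;

   for( i = 1; i < count; i++ )
      if( strlen( Strings[i] ) > strlen( Strings[longest] ) )
         longest = i;

   return longest;
}
```

The compiler needs the second dimension, 80, to find the start of each row; the first dimension plays no part in that calculation, which is why it can be left out.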


Returning Values from a Function 


All the examples of functions we have created up to now have returned a
single value. Is it possible to return anything other than a single value?
Well, not directly, no, but the single value returned need not be a numeric
value. It can also be an address, providing the key to returning any amount
of data. You just use a pointer - but this is where the pitfalls start so you
need to be very careful.
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Returning a Pointer 


Returning a pointer value is very easy. A pointer value is just an address,
so if you want to return the address of some variable called value, you can
just write:


return &value;


and as long as the function header and prototype indicate the return type 
appropriately, then we don't have a problem. Of course, if you have a 
pointer variable with the address already stored, then you can use that in 
the return statement. Assuming that the variable value is of type long, the 
prototype of a function containing the above return statement might be: 
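With a hypothetical function name, GetAddress(), and a variable value defined outside any function so that it still exists after the function returns, the prototype and a matching definition could be sketched as:

```c
long value = 25L;                    /* Defined outside any function */

long* GetAddress( void );            /* Prototype: returns a pointer to long */

/* Return the address of the variable value */
long* GetAddress( void )
{
   return &value;
}
```

The caller can then write *GetAddress() to reach the variable itself.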





So let's look at a function which will return a pointer. 


A Pointer Returning Example 
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We could try to write a function that produces a factorial of an integer, the 
product of all integers from 1 to the given number. For example, factorial 4 
(usually written 4!) is the equivalent of 1x2x3x4, which is 24. You should 
know in advance that this first attempt to produce the function doesn't 
work, but press on - it's educational. 


Let's assume that we need a function to return a pointer to the factorial of 
its argument value. Our first try might look like: 





We could create a little test program to see what happens: 
/* EX5-06.C Testing the factorial function */
#include <stdio.h>

long* factorial( long number );      /* Function prototype */

int main(void)
{
   long num = 5L;                    /* Test value */
   long *ptr = NULL;                 /* Pointer to returned value */

   ptr = factorial( num );
   printf("\nFactorial of 5 should be %ld", 1L*2L*3L*4L*5L);
   printf("\nResult = %ld", *ptr);   /* Display returned value */

   return 0;
}


Program Analysis 


The function main() calls the factorial() function and stores the returned 
address in the pointer ptr. This should point to a value which is the 
factorial of the argument num. We then display the result of explicitly 
computing 5! to check against the result from the function. On my computer 
I get the output: 


Factorial of 5 should be 120 
Result = 13172 


Well, clearly the second line doesn’t reflect the correct value. The error 
arises because we’re returning the address of a variable that is local to the 
function. The variable result in the function factorial() is created when 
the function begins execution and is destroyed on exiting from the function. 
The memory previously allocated to result becomes available for other 
purposes, and here it has evidently been used for something else. Here you 
must remember that there is a cast-iron rule: 








Don't even think about returning the address of a local
variable from a function.





Now we have a function that doesn’t work, and we need to think about 
how we can correct it. One answer lies in dynamic memory allocation. With 
the library function malloc(), we can create a new variable in the free 
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store that will continue to exist until it’s eventually destroyed by a call to 
the function free(), or until the program ends. The function would then 
look like this: 
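A sketch of that version, following the description in the next paragraph (result declared as long*, a check on the address returned by malloc(), and a call to exit() on failure); the form of the loop and the exit code of 1 are assumptions:

```c
#include <stdlib.h>

/* Return a pointer to the factorial of number, held in the free store */
long* factorial( long number )
{
   long *result = malloc( sizeof( long ) );

   if( result == NULL )              /* Check that we got a valid address */
      exit( 1 );                     /* If not, end the program */

   *result = 1L;
   while( number > 1L )              /* Multiply by number, number-1, ..., 2 */
      *result *= number--;

   return result;                    /* The caller must free() this memory */
}
```

The memory obtained here outlives the function call, so the address returned remains valid until the caller passes it to free().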





We need to remember to include the header file STDLIB.H to use the
malloc() function. Rather than declaring result as of type long, we now
declare it as long* and store the returned malloc() address in it. We then
have the necessary check that we got a valid address back, and exit the
program if anything is wrong. The function exit() that is used here is
from the standard library, and is declared in STDLIB.H. It provides a means
of terminating a program from any point. The integer argument is passed
back to the operating system environment as a termination condition. Zero
usually indicates a normal program termination.


Since result is now a pointer, the rest of the function is changed to reflect 
this and the address contained in result is finally returned to the calling 
program. You could exercise this version by replacing the function in the 
previous program with this version. You will see that this now works as 
you would expect. 


However, this is a rather poor implementation of this function. It would be
much better to return a value rather than a pointer in this case, but at least
it shows that you can return a pointer. You need to remember that with
dynamic memory allocation in a function like this, memory is allocated each
time the function is called, and it's the responsibility of the calling program
to free the memory when it's no longer required. It's easy to forget to do
this in practice, with the result that the heap is gradually eaten up until
there's no more memory available and the program fails. When
allocating memory dynamically, it's good practice to free the memory within
the scope where it is allocated.


Static Variables in a Function 


There are some things that you can't do with automatic variables in a 
function. For example, you can't count how many times a function is called, 
because you can't accumulate a value from one call to the next. However, 
there's more than one way to get around this if you need to. A good 
solution in most instances is to declare a variable within a function as 
static. You use exactly the same form of declaration for a static variable 
that we saw in Chapter 2. For example, to declare a variable count as 
static you could use the statement:


static int count = 0;


Initialization of a static variable within a function only occurs the first time 
the function is called. In fact on the first call of a function, the static 
variable is created and initialized. It then continues to exist for the duration 
of the program execution, and whatever value it contains when execution of 
the function is complete, is still available when the function is next called. 
We can demonstrate how this works with a simple example: 






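[Listing: a program using a static variable within a function. The function's prototype specifies no arguments and no return value; on each call the function displays the number of times it has been called, passing ++count as the argument to printf().]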
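A minimal sketch of the function in such a program, assuming the hypothetical name CountCalls() and message wording (neither can be recovered with certainty); main() is omitted and would simply call the function several times:

```c
#include <stdio.h>

void CountCalls( void );             /* No arguments or return value */

/* Display how many times this function has been called */
void CountCalls( void )
{
   static int count = 0;             /* Initialized once, not on every call */

   printf("\nThis function has been called %d times.", ++count);
   return;                           /* No value: the return type is void */
}
```

Removing the keyword static would make count an automatic variable, and every call would then report a count of 1.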
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Program Analysis 


Because the increment operator is used in its prefix form, the newly
incremented value is displayed by the printf() function, so it will be 1 on
the first call, 2 on the second and so on. Because the variable count is
static, it continues to exist and retain its value from one call of the
function to the next.


Note the return statement. Because the return type of the function is void, 
to include a value would be an error. You don't actually need to include a 
return statement in this particular case. Running off the closing brace for 
the body of the function is equivalent to the return statement without a 
value, so the program will compile and run without the return. However, I 
prefer to include the return anyway. 
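

The behavior described above can be sketched as follows. The function
name counter() is an assumption made for illustration, and unlike the
void function discussed in the text, this version also returns the count
so that the effect of the static variable can be checked by the caller.

```c
#include <stdio.h>

/* Hypothetical function: displays and returns the number of calls so far */
int counter(void)
{
   static int count = 0;   /* Initialized once; value persists between calls */

   printf("\nThis function has been called : %d times.", ++count);
   return count;
}
```

Calling counter() three times in succession should display the values 1,
2 and 3, since count is never reset between calls.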


Recursive Function Calls 


When a function contains a call to itself, it’s commonly referred to as a 
recursive function. This may seem like a recipe for an infinite loop, and if 
you aren't careful it certainly can be. A prerequisite for avoiding an infinite 
loop is that the function contains some means of stopping the process. 
Unless you have come across the technique before, the sort of things to 
which recursion may be applied may not at first be obvious. 


However, situations that need recursion occur surprisingly often. As well as 
various mathematical functions, such as the factorial of an integer that we 
saw earlier, analyzing statements in a programming language can often use 
recursion to good effect. We shall, however, take something a little simpler 
to start with. Earlier, we produced a function to compute the integral power 
of a value, that is, compute x^n. We can implement this as an elementary
illustration of recursion in action: 





#include <stdio.h>
#include <stdlib.h>                      /* For exit() */

double power( double x, int n );         /* Function prototype */


int main()
{
   int index = 3;      /* Raise to this power */
   double x = 3.0;     /* Different x from that in function power */
   double y = 0.0;     /* Store return value here */

   y = power( 5.0, 3 );                  /* Passing constants as arguments */
   printf("\n5.0 cubed is %f", y);       /* Display the result */


   /* Calling the function in an argument to printf() */
   printf("\n3.0 cubed = %.3f", power( 3.0, index ));


   /* Calling the function in an argument to a call of the same function */
   x = power( power( x, 2 ), index );    /* Computes x to the power 6 */
   printf("\nx = %.2E\n\n", x);


   /* Using a function in a loop */
   for (index=0; index<=8; index++)
      printf("%6.0f", power(2.0, index) );


   return 0;
}


/******************************************
 * A function which will compute the      *
 * integral power, n, of a double value,  *
 * x, and return the result as double.    *
 ******************************************/
double power( double x, int n )
{
   if(n<0)
   {
      printf("\nNegative index, program terminated.");
      exit(1);
   }

   if(n)
      return x*power( x, n-1 );
   else
      return 1.0;
}


Program Analysis 


We only intend to support positive powers of x, so the first action is to 
check that the value of the second argument, n, isn't negative. 


if(n<0)
{
   printf("\nNegative index, program terminated.");
   exit(1);
}


With a recursive implementation this is essential, since a negative value will 
cause an infinite loop. 


The if statement provides for the value 1.0 being returned if n is zero, or
otherwise returning the result of the expression x*power( x, n-1 ). This 
causes a further call of the function power() with the index value reduced 
by 1. Clearly, within the function power(), if the value of n-1 is greater 
than zero, then a further call of the power() function will occur. Ultimately, 
the function power() will be called with an index value of 0, so 1 will be 
returned. This will be multiplied by x at the next level, and that value 
subsequently returned. The recursive calls will continue to unwind until the
first level returns x^n. For a given value of n greater than 0, the function
will call itself n times. The operation of the function with n having the 
value 3 is illustrated here: 


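[Figure: power(x,3) calls power(x,2), which calls power(x,1), which calls
power(x,0). The innermost call returns 1.0, and each level then returns x
times the value returned to it.]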


Using Recursion


Unless you have a problem which particularly lends itself to using recursive 
functions, or if you have no obvious alternative, then it's generally better to 
use a different approach, such as a loop. This will be much more efficient 
than using recursive function calls. Think about what happens with our last 
example to evaluate a simple product, x*x*...x n times. On each call, the 
compiler will generate copies of the two arguments to the function. It also 
has to keep track of the location to return to when each return is executed. 
It is also necessary to arrange to save the contents of various registers, so 
that they can be used within the function power(). Of course, these will
need to be restored to their original state at each return from the function. 
With a quite modest depth of recursive call, the overhead will be 
considerably greater than using a loop. 
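

As a point of comparison, here is a minimal sketch of the same
calculation using a loop. The name power_loop() is an assumption made for
illustration. It performs the n multiplications within a single function
call, so none of the call overhead described above is incurred.

```c
#include <stdio.h>
#include <stdlib.h>

/* Loop-based equivalent of the recursive power() function */
double power_loop(double x, int n)
{
   double result = 1.0;    /* x to the power 0 */
   int i = 0;

   if (n < 0)
   {
      printf("\nNegative index, program terminated.");
      exit(1);
   }

   for (i = 0; i < n; i++) /* Multiply by x, n times */
      result *= x;

   return result;
}
```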


This isn't to say that you should never use recursion. Where the problem 
suggests the use of recursive function calls as a solution, the technique can 
be immensely powerful and can greatly simplify your code. 


Pointers to Functions 


A pointer stores an address value which, up to now, has been the address 
of another variable with the same basic type as the pointer. This has 
provided considerable flexibility by allowing us to use different variables at 
different times through a single pointer. A pointer can also point to the 
address of a function. This enables you to call a function through a pointer, 
and the specific function that will be called will be the function that was 
last assigned to the pointer. 


Obviously, a pointer to a function must contain the address of the function 
to which it points, but if it’s to work properly, more information is 
necessary. It has to maintain information about the parameter list for the 
function it points to, as well as the return type. Therefore, when we declare 
a pointer to a function, the parameter types and the return type of the 
functions it can point to have to be specified, in addition to the name of 
the pointer. 


Declaring Pointers to Functions


Let's declare a pointer pfun, that can point to functions that take two 
arguments of type char* and int, and will return a value of type double. 
The declaration would be:


double (*pfun)(char*, int);


This may look a little weird at first because of all the parentheses. The
parentheses enclosing the pointer name, pfun, and the asterisk are necessary,
since without them the statement would be a prototype for a function pfun()
returning a pointer to double, rather than the declaration of a pointer to a
function.


You can initialize a pointer to a function by including the name of a
function in the declaration of the pointer. Assuming we have a function
defined with the prototype:


long sum(long a, long b);


we can declare a pointer to a function with the statement:


long (*pfun)(long, long) = sum;


Here, the pointer pfun is declared as pointing to any function that accepts
two arguments of type long, and also returns a long value. It is also
initialized with the address of the function sum(). We could now call the
function sum() using the pointer with a statement such as:


total = pfun(ivalue, jvalue);


Here the variables total, ivalue and jvalue are all of type long.


Of course, you can also initialize a pointer to a function with an assignment 
statement. Assuming the pointer pfun has been declared as above, and that 
we've declared and defined the function product() accepting two
arguments of type long, we could set the value of the pointer with the
statement:


pfun = product;


As with pointers to variables, you must ensure that a pointer to a function
is initialized before you use it to call a function. Without initialization,
the pointer holds an unpredictable address, and calling a function through
it is likely to cause catastrophic failure of your program.
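

One defensive habit, sketched here as an illustration rather than as
something the text prescribes, is to set a function pointer to NULL until
a real function is assigned, and to test it before calling through it.
The helper call_if_set() and its fallback parameter are hypothetical.

```c
#include <stddef.h>

long sum(long a, long b)
{
   return a+b;
}

/* Hypothetical helper: calls through pfun only if it has been set */
long call_if_set(long (*pfun)(long, long), long a, long b, long fallback)
{
   if (pfun == NULL)       /* Pointer was never assigned a function */
      return fallback;

   return pfun(a, b);      /* Safe to call through the pointer */
}
```

With this arrangement, call_if_set(NULL, 2, 3, -1) returns the fallback
value -1, while call_if_set(sum, 2, 3, -1) calls sum() and returns 5.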


Using Pointers to Functions


To get a proper feel for how a pointer to a function operates, let’s try one 
out in a program: 


/* EX5-09.C Exercising pointers to functions */
#include <stdio.h>


long sum(long a, long b);                /* Function prototype */
long product(long a, long b);            /* Function prototype */


int main(void)
{
   long (*pdo_it)(long, long);           /* Pointer to function declaration */


   pdo_it = product;
   printf("\n3*5 = %ld", pdo_it(3, 5));  /* Call product thru a pointer */


   pdo_it = sum;                         /* Reassign pointer to sum() */


   /* Now call sum() through a pointer - twice */
   printf("\n3*(4+5) + 6 = %ld", pdo_it(product(3, pdo_it(4, 5)), 6));


   return 0;
}


/* Function to multiply two values */
long product(long a, long b)
{
   return a*b;
}


/* Function to add two values */
long sum(long a, long b)
{
   return a+b;
}


Program Analysis 


This is hardly a useful program, but it does show how a pointer to a 
function is declared, is assigned a value, and is subsequently used to call a 
function. 


After the usual preamble, we declare a pointer to a function, pdo_it, that
can point to any function with two arguments of type long, and returning
a value of type long. The two functions we've defined, sum() and
product(), are consistent with this. The pointer is used to store the address
of the function product() in the assignment statement:


pdo_it = product;


The name of the function is used here in a similar manner to an array name
when initializing an ordinary pointer - no parentheses or other
adornments are required. The function name is automatically converted to
an address, which is stored in the pointer.


The name of the pointer is used just as if it were a function name, and is 
followed by the arguments between parentheses exactly as they'd appear if 
the original function name was being used directly. 


Just to show we can do it, the pointer is then changed to point to the 
function sum(). We then use it again in an incredibly convoluted expression 
to do some simple arithmetic. From this you can see that a pointer to a 
function can be used in exactly the same way as a function. 


A Pointer to a Function as an Argument 


Since a pointer to a function is a perfectly reasonable type, a function can 
also have an argument that's a pointer to a function. This allows the calling 
program to determine which function is to be called from inside a function. 
You can pass a function explicitly as an argument in this case. 


We can look at this with an example. Suppose that we need a function to 
process an array of numbers by producing the sum of the squares of each 
on some occasions, and the sum of the cubes on others. One way of 
achieving this is by using a pointer to a function as an argument: 










Program Analysis 


The first statement of interest is the prototype for the function sumarray(). 
Its third parameter is a pointer to a function. The pointer can store the 
address of a function that has a single parameter of type double, and 
returns a value of type double. 


We call the function sumarray() twice in main(), the first time with
squared as the third argument, and the second time using cubed. In each
case the address corresponding to the function name used as an argument
will be substituted for the function pointer in sumarray(). As a result, the
appropriate function will be called within the for loop, so that sumarray()
will return the sum of squares in the first instance, and the sum of cubes in
the second.


There are obviously easier ways of achieving what this example does. But
you can see how using a pointer to a function can provide you with a lot 
of generality. You could pass any function you care to define to the function 
sumarray(), as long as it takes one double argument and returns a value 
of type double. 


The example will generate the output: 


Sum of squares = 169.750 
Sum of cubes = 1015.875


These answers are just what we'd expect, so obviously the function pointer 
is doing its job. 
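

A minimal sketch of the functions described in this analysis is shown
below. The function names and the position of the function pointer as the
third parameter follow the text, but the other parameter names, and the
order of the first two parameters, are assumptions.

```c
/* Functions that can be passed to sumarray() */
double squared(double x)
{
   return x*x;
}

double cubed(double x)
{
   return x*x*x;
}

/* Sum the results of applying the function pfun to n elements of array */
double sumarray(double array[], int n, double (*pfun)(double))
{
   double total = 0.0;
   int i = 0;

   for (i = 0; i < n; i++)
      total += pfun(array[i]);   /* Call whichever function was passed */

   return total;
}
```

The data used in the original program isn't shown, but an array holding
the seven values 1.5, 2.5, 3.5, 4.5, 5.5, 6.5 and 7.5 is one set of
values that reproduces the sums quoted above: passing it with squared
gives 169.75, and with cubed gives 1015.875.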


Handling Command-line Arguments 


Arguments can be passed to your program when you execute it, and you 
can process these very easily. When you start your program, you access
specified arguments on the command-line through parameters to the 
function main(). There can be two parameters to the function main(), 
usually named argc and argv: 


[Figure: argv is an array of pointers, argv[0], argv[1], argv[2] and so on,
each pointing to one of the argc strings.]


The first parameter, argc (of type int), is a count of the number of
arguments specified on the command-line invoking the program, including
the program name. The second parameter, argv, is an array of pointers to
character strings. The first string is the name of the program, and the
following strings are the command-line arguments. The pointer to the last
command-line argument, argv[argc-1], is followed by a null pointer in
argv[argc]. Since the program name is normally present, argc is normally
at least 1. In a practical situation, the strings may be of various lengths.


We can see how this works using a program that just displays the 
command-line arguments: 


/* EX5-11.C Displaying the command-line */
#include <stdio.h>

int main(int argc, char *argv[])
{
   int i = 0;                    /* Loop counter */
   printf("\n");                 /* Start on a new line */
   for( ; i<argc; i++)
      printf("%s ", argv[i]);    /* Display a command-line argument */
   return 0;
}


Program Analysis 


The header for the function main() specifies the two parameters argc and 
argv. Usually, they are given these names, but you could use your own if 
you wish. The for loop steps through the strings pointed to by argv[] up 
to argv[argc-1], displaying the complete command-line, starting with the 
program name and followed by each argument. 
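

Once you can step through argv in this way, it's a short step to
examining each argument. As an illustration only, the hypothetical
function below counts the arguments that begin with a hyphen, skipping
the program name in argv[0]. The convention that options start with a
hyphen is an assumption, not something C itself imposes.

```c
/* Hypothetical helper: counts the arguments after the program name */
/* that begin with the character '-'                                */
int count_options(int argc, char *argv[])
{
   int i = 1;                    /* Start after the program name */
   int count = 0;

   for ( ; i < argc; i++)
      if (argv[i][0] == '-')     /* First character of the argument */
         count++;

   return count;
}
```

You would typically call this from main() as count_options(argc, argv).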


Another Look at Scope 


We briefly looked at variable scope in Chapter 2, but we didn't go through 
the whole story, so let's rectify that now. We already know that the scope of 
automatic variables defined within a block extends from the point at which 
they're declared to the end of that block. Their existence is also limited to 
the same extent, so that at the end of the block in which they are declared, 
they are discarded. This applies to any block, including function blocks. 


A variable defined outside of all the blocks in a program is a global
variable. A global variable has a scope which extends from the point of its 
declaration to the end of the file in which its declaration appears. It's 
accessible anywhere in the program file, as long as another variable hasn't 
been declared with the same name in another block elsewhere in the 
program file. If it has, then the global variable is hidden by the local 
variable of the same name. We can show how scope is determined 
graphically in the following illustration: 


[Figure: the program file Example.c, showing the extent of the scope of each
of the variables value1 to value5, which are declared in main(), in
function(), and at global level.]


This shows a single program file containing two functions, main() and
function(). The variable value1 is global, and has a scope which extends 
from its declaration point to the end of the program file. It can therefore be 
accessed anywhere in the file, except within function(), where it'll be 
hidden by the local variable of the same name. The variable value2 is 
declared in main() and has ‘function’ scope. It exists from its declaration to 
the end of main(). The variable value3 is declared in a nested block in 
main(), so its scope is limited to the inner block. 


The variable value4 is another global variable with a scope running from 
its declaration point to the end of the file. Because its declaration appears 
after main(), it can't be accessed in main(). The scope of value5 extends 
from its declaration to the end of function(). 


Function names have a scope which extends throughout the entire file, but 
either the function or its prototype must appear in the file before it is 
called. 
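

The way a local variable hides a global variable of the same name can be
sketched in a few lines. The variable name value1 follows the discussion
above, but the function names and the values used are assumptions made
for illustration.

```c
int value1 = 10;           /* Global: scope runs to the end of the file */

int read_global(void)
{
   return value1;          /* No local value1 here, so this is the global */
}

int read_local(void)
{
   int value1 = 99;        /* Local: hides the global value1 in this block */
   return value1;
}
```

Here read_global() returns 10 and read_local() returns 99, while the
global value1 itself is left unchanged by the call to read_local().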


Multiple Source Files 


A program divided into two or more source files raises a few additional 
questions. First of all, how can a function in one file call a function that has 
its definition in another file? This is quite straightforward. Every function 
that is used in a file, apart from main(), should have a prototype defined 
in the file. This leads to the idea of a file which contains definitions and 
declarations for everything that's common across the whole program. So, in 
programs contained in multiple source files, there is usually one file which 
contains all the function prototypes. The file normally has the extension .H
and is copied into each source file that needs it with an #include statement.





We know that global variables declared outside of a function continue to 
exist throughout the life of a program. It would seem reasonable to expect 
that we could access a global variable from anywhere in a program, and 
indeed we can, but we do need to tell the compiler about it. 


External Variables


To access a global variable that is defined in another file, we use
the extern keyword. Assume that we have a variable number of type long,
that is defined at global scope with the statement:


long number;


To use the same variable in another file, we must include the declaration:


extern long number;


This statement simply advises the compiler that the variable number isn't 
defined in this file, but in another. The previous statement defined the 
global variable number. There can only be one definition of a global variable 
in your program, although external declarations for a global variable can 
appear as many times as you want, so these are also often aggregated into 
a common .H file. 


Private Variables and Functions 


Where static is applied to a global variable or function it causes it to have 
scope only within the file in which it is declared. Where static is applied to 
a local variable, although stored as a global variable, its scope is restricted 
to the function in which it is declared. Declaring a global variable as 
static ensures its privacy to the file in which it is defined. The statement:


static long number;


limits the global variable number to the file in which this statement appears, 
and extern statements in other files for a variable of the same name won't 
be able to access this variable. 


You can also apply the keyword static in function prototypes. A prototype
for a function myfun() that begins with the keyword static limits the use
of the function to the file in which the prototype appears, and obviously
the function definition must appear in the same file. The function myfun()
is then private to one source file, allowing the possibility for other
parts of the program to use a different function with the same name.
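

A minimal sketch of a source file using both kinds of privacy is shown
below. The parameter and return types of myfun() and the public function
bump() are assumptions made for illustration, since only the names number
and myfun() come from the text.

```c
static long number = 0L;              /* Private to this source file */

static long myfun(long increment);    /* Prototype: private to this file */

/* Hypothetical public function that other files could call */
long bump(long increment)
{
   return myfun(increment);
}

/* Private function: adds increment to number and returns the new total */
static long myfun(long increment)
{
   number += increment;
   return number;
}
```

Other source files can call bump(), but they can't refer to number or
myfun() directly, even with an extern declaration.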


Summary 


You should now be thinking about structuring your programs as sets of 
functions. Using functions enables you to shorten development time by 
simplifying the units of code that you need to deal with, making them 
easier to write and test. It also enables you to reuse functions that provide 
general processing capability in multiple programs. 


The important points we have covered in this chapter are: 


Each function in your program, with the exception of main(),
requires a function prototype, which should be placed at the 
beginning of the program file. 


All variable names used within a function definition are local to the
function and can be duplicated elsewhere. 


A function with a return type other than void must contain a return 
statement. A function with a return type of void need not contain a 
return statement, but if it does, a return value mustn't be specified. 


When using a function, arguments should agree in number and type 
with those appearing in the prototype of the function and the 
function definition. If an argument type is different from the 
corresponding parameter type, then the compiler will attempt to 
convert the argument appropriately. 


You must never return a pointer to a local variable from a function. 


A pointer to a function can store the address of a function and can 
be used to subsequently call the stored function. 


A recursive function is a function that calls itself. To avoid infinite 
loops, care must be taken to ensure that a recursive function contains 
the means of ending a sequence of recursive calls. 


Programming Exercises


1  Write a function to compare two strings specified as arguments, and
   return a value of 1 if the first string is greater than the second,
   and 0 otherwise.


2  Use the function from the first exercise to read a series of strings,
   and then sort them into descending order and display them.
   (Hint: Sort them in order by interchanging pointers.)


3  Write and test a function to accept an argument between 1 and 7, and
   return the name of the day of the week as a string.


4  Write and test a function to append a string onto the end of another.
   Use this function to write a program to assemble a series of input
   lines into a single string, and analyze the composite string for the
   frequency of occurrence of each letter. Try to write a function to test
   the frequency of each word as well.


5  Write and test a function to accept two string arguments, and find the
   initial occurrence of the first within the second. The index value of
   the first character of the occurrence should be returned, with -1 as the
   return value if the first string isn't found within the second.


6  Write a program using a recursive function to calculate the factorial of
   n, that is 1*2*3*...*n.


Chapter 6 - Data Structures


This is the last chapter introducing new methods for organizing data in a C
program. The tools we are going to look at in this chapter will enable you
to handle any kind of data structure your application may require. In this
chapter you will learn:


What a structure is and how it is defined.


How you access and process members of a structure.


How you can use pointers to organize and link a series of structure
variables.


What a linked list is and how it's used.


What a binary tree is, how it's constructed and used.


What a union is and how it can be applied.





Structures 


Although arrays are very useful, they don't accommodate reality very well 
since all the elements are essentially the same type. Most things that you 
want to deal with need a variety of data elements to describe them, often 
spanning the whole spectrum of data types we have seen so far in C. 


If you wanted to describe something quite mundane, such as a TV set, then 
it has a brand name, a screen size, is color or monochrome, tunes a certain 
number of channels, has external dimensions, weighs a certain number of 
pounds, and consumes a certain amount of power, amongst many other 
things. It would be very useful to be able to handle a varied collection of 
data items such as this, under a single variable name, perhaps TVSet, and 
be able to access the component elements defining an entity of this kind 
when necessary. This is exactly what a structure enables you to do. 


Declaring a Structure 


A structure is a group of one or more variables of various types, identified
by a single name. The first step in creating a structure is the definition of 
what it contains. This can then be used as a template for declaring variables 
which are instances of that particular structure. 


Let's take an example. Supposing due to the failure of our attempts to 
induce rain or otherwise control the weather, we now turn to the heavens. 
We are interested in having a variable type for planets, since we are going 
to record basic information about the solar system. We can define a 
structure for planets as follows: 





The keyword struct indicates that this is a structure. This statement
doesn't define a variable, it just defines a template called Planet which can
be used to define variables with those data elements appearing between the
braces. This amounts to a new type, in this example named Planet, but
generally called a structure tag. The variables within this template are called
members.


Each variable of type Planet will contain its own set of members with the
names specified in the definition of the generic structure type. Note that a
semi-colon is required after the closing brace in the definition of a structure.
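

As a sketch of what such a definition looks like, the structure below
uses the member names that appear in this chapter. The types of Name and
SunDistance follow the text, but the types chosen for the other members
are assumptions.

```c
struct Planet
{
   char Name[80];          /* Name of the planet */
   double Mass;            /* Assumed type */
   double Year;            /* Assumed type */
   double Temperature;     /* Assumed type */
   int Moons;              /* Assumed type */
   double SunDistance;     /* Distance from the sun */
};
```

Note the semi-colon after the closing brace. No memory is allocated by
this definition, since it only describes what a Planet variable will
contain.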


Declaring Variables 


We can declare a variable of type Planet with the statement:


struct Planet Earth;


This declares Earth as a structure variable of type Planet, so the variable
Earth has the data members Name[], Mass, Year, Temperature, Moons, and
SunDistance. We can also define multiple structure variables in a single
declaration:


struct Planet Mars, Venus, Pluto;


This statement declares the three variables Mars, Venus and Pluto, which 
are all of type Planet. 


Declaring Variables and the Structure Together 


We could also have declared variables within the initial statement defining 
the structure: 








Here we've defined the structure type, and declared the two variables 
Mercury and Uranus. The structure tag name can be omitted from a 
definition of structure, but obviously since you’ve no means of referring to 
the structure type subsequently, all the variables that you want of this 
structure type must be declared within this original definition. 


You can declare structure variables within a block, or as global objects
outside of any function. They can also be static. The members of a 
structure can be any kind of variable, including being another structure, 
although a given structure MyStruct cannot contain a structure object 
member of type MyStruct. 


Using typedef 


We can also use typedef when we define a structure. For example we 
could define the structure Planet with the statement: 





Don't confuse this with the previous declaration where we also defined
Mercury and Uranus as instances of the structure Planet. Here we've
defined the structure Planet, and we've also defined PLANET as a new
name for the type struct Planet. We can now use this new name to
define instances of the structure Planet with a statement such as:


PLANET Mercury, Uranus;


This declares two variables Mercury and Uranus, and is equivalent to the
statement:


struct Planet Mercury, Uranus;


When we use PLANET we no longer need to insert the keyword struct, so 
using a typedef can make your programs easier to read and more succinct. 


Initializing Structures 


Structure members can be initialized when they're declared in a similar way 
to arrays. The initializing values are specified between braces, and appear 
after an equals sign following the variable name, for example: 


This declares the structure Mars, and initializes it with a variety of data
values. The correspondence between the values specified and the members
of Mars is as follows:


[Figure: each initializing value paired with the member it sets, starting
with char Name[80] and ending with double SunDistance.]





The initializing values must appear in the order that corresponds to the 
sequence in the structure type definition. 


Using Structures 


The ways in which you can use a structure as a whole are actually quite 
limited. You cannot compare structures, or use one in an arithmetic 
expression, but you can assign one structure to another of the same type. If 
we define a structure object like this: 





then we can write the assignment: 





This will result in member-by-member copying from the object Mars to the 
object RedPlanet. So after this statement, the data members of Mars and 
RedPlanet will be identical. 
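

The following sketch puts initialization and assignment together. The
member types other than Name and SunDistance are assumptions, the values
for Mars are rough illustrative figures rather than the ones used in the
original, and the functions same_planet() and copy_demo() are
hypothetical. Because structures can't be compared as a whole,
same_planet() compares them member by member.

```c
#include <string.h>

struct Planet
{
   char Name[80];
   double Mass;
   double Year;
   double Temperature;
   int Moons;
   double SunDistance;
};

/* Returns 1 if every member of a matches that of b, and 0 otherwise */
int same_planet(struct Planet a, struct Planet b)
{
   return strcmp(a.Name, b.Name) == 0 &&
          a.Mass == b.Mass && a.Year == b.Year &&
          a.Temperature == b.Temperature &&
          a.Moons == b.Moons && a.SunDistance == b.SunDistance;
}

/* Initializes one structure, assigns it to another, and compares them */
int copy_demo(void)
{
   /* Values appear in the same order as the members (illustrative only) */
   struct Planet Mars = {"Mars", 6.4e23, 687.0, -60.0, 2, 2.28e8};
   struct Planet RedPlanet;

   RedPlanet = Mars;       /* Member-by-member copy */
   return same_planet(Mars, RedPlanet);
}
```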


The only other operations you can carry out on a structure object are to
take its address using the & operator, and to pass it to a function as an 
argument, or to return it as a value from a function. However, this isn't 
quite so restricting, since we can do just about anything we like with the 
individual members of a structure - as long as they aren't structure objects 
themselves, of course. 


Using Members 


You can refer to individual members of a structure by using the structure 
member operator, which is a period. For example, to set the member Mass 
in the structure Earth, you could use the statement: 





You can also use a structure member just like a variable of the same type. 
For example, if we've declared a variable MassRatio as a double, we can
write:





Naturally, the values of the structure members involved in this statement 
must have previously obtained values from somewhere, for the calculation 
to work. 
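

As a sketch of members being used like ordinary variables, the
hypothetical function mass_ratio() below divides the Mass member of one
Planet by that of another. The member types, apart from those of Name and
SunDistance, are assumptions.

```c
struct Planet
{
   char Name[80];
   double Mass;
   double Year;
   double Temperature;
   int Moons;
   double SunDistance;
};

/* Hypothetical function: ratio of the mass of a to the mass of b */
double mass_ratio(struct Planet a, struct Planet b)
{
   return a.Mass/b.Mass;   /* Members used just like double variables */
}
```

Both Mass members must have been given values before the function is
called, and the Mass member of the second argument must not be zero.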


Using Members that are Structures 


Where a member of a structure is another structure, we can still access the 
members of the second structure. Let's take a geometric example. Suppose
that we define a structure for a screen co-ordinate object as:


struct Point
{
   double x;
   double y;
};


A Point object will contain the pair of co-ordinates x and y, which are both 
of type double. We could define and initialize two Point objects with the 
statement: 


struct Point P1={1.0, 1.0}, P2={5.0, 5.0};


so P1 has the coordinates 1.0,1.0, while P2 takes the coordinates 5.0,5.0. We
can now define a structure to represent lines with the definition:


struct Line
{
   struct Point P1;
   struct Point P2;
};


This defines a Line object as a pair of Point objects which are also
structures themselves. We could now declare and initialize a Line object L1,
with the statement:


struct Line L1={Startpt, Endpt};   /* Line defined by two points */


Assuming that we've declared another point P3, we could assign the value
of a member of L1 to it:


P3=L1.P2;


This will copy the members x and y of the structure variable L1.P2 to the
corresponding members of P3.


If we now wanted to alter one of the members of the P1 member of the
Line structure L1, then we just use a second level of the structure member
operator:


L1.P1.x += 1;
L1.P1.y = L1.P1.x+3;


The first statement increments the x member of the P1 member of the Line
object L1. The second statement sets the y member of the member P1 of
Line object L1 to 3 more than the current x member. This diagram shows
how the Line L1 and its members are referenced:


[Figure: the Line object L1 and its members. L1.P1 and L1.P2 are Point
members, whose own members are referenced as L1.P1.x, L1.P1.y, L1.P2.x
and L1.P2.y.]





Structures as Function Arguments 


Using structures with functions provides a very powerful combination. We 
can define a function to calculate the length of a line by passing a Line 
object as an argument, and returning the value of the length. The function 
definition would be: 





The calculation of line length is illustrated here: 








y2 


y1 


The calculation of the distance between two points uses Pythagoras’
Theorem which you'll remember from high school, about the square on the 
hypotenuse being equal to the sum of the squares of the other two sides. 
The sqrt() function is a standard library function which accepts an 
argument of type double, and returns its square root as a double value. 
You need to include the header MATH.H to use it. 


Structures as Return Values 


Returning a structure from a function isn't a problem either, so you
can write a function to create objects of a particular structure type. A
function to create a Point object from two double arguments could take
the following form:


/* A function to create a point */
struct Point CreatePoint(double x, double y)
{
   struct Point aPoint;     /* Local Point object */
   aPoint.x = x;            /* Set x coordinate */
   aPoint.y = y;            /* Set y coordinate */
   return aPoint;
}


The Point object is declared local to the function and initialized with the x
and y values passed to it. A copy of this Point object is returned from the 
function. You can use this function to set the value of a Point object with 
the statement: 





We could also write a similar function to set up a Line object, although in 
this case we might want to write two functions to take care of the different 
possible options: 





These two functions enable you to create a Line object from two points, or 
from two pairs of co-ordinate values from the two points that define the 
line. The second function calls the first function as well, just to show that 
it’s possible. You could also implement the second function by using the co- 
ordinate values to directly set the x and y values of the Point objects 
contained in a Line object. 
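
As a rough sketch of the functions just described, the fragment below creates a Point, and then a Line in the two ways mentioned. The names CreateLine() and CreateLineXY() and their parameter lists are assumptions made for illustration; only CreatePoint() follows the form shown earlier.

```c
struct Point { double x; double y; };
struct Line  { struct Point P1; struct Point P2; };

/* Create a point from two coordinate values */
struct Point CreatePoint(double x, double y)
{
  struct Point aPoint;             /* Local Point object */
  aPoint.x = x;
  aPoint.y = y;
  return aPoint;                   /* A copy is returned */
}

/* Create a line from two existing points (assumed name) */
struct Line CreateLine(struct Point P1, struct Point P2)
{
  struct Line aLine;               /* Local Line object */
  aLine.P1 = P1;
  aLine.P2 = P2;
  return aLine;
}

/* Create a line from two pairs of coordinates (assumed name). */
/* It calls the first function to do the work.                 */
struct Line CreateLineXY(double x1, double y1, double x2, double y2)
{
  return CreateLine(CreatePoint(x1, y1), CreatePoint(x2, y2));
}
```

With these in place, a statement such as aPoint = CreatePoint(1.5, 2.5); sets the value of an existing Point object, and L1 = CreateLineXY(1.0, 2.0, 3.0, 4.0); does the same for a Line.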


We shall now see how structures work in practice by trying them out in a 
complete example. Let's stay with the geometric context for the time being, 
and look at an example to calculate the intersection of two lines. 
An Example


Before we write the program we must look at how we're going to perform
the calculation. A line defined by two points P1 and P2, with respective
co-ordinates x1,y1 and x2,y2, can be represented in the form:


P = P1 + (P2-P1)t 


This is called the ‘parametric form’ since points on the line are defined by 
values of the parameter, t. When t is zero the value of P is P1, and when 
t is 1 then P is the point at the other end of the line P2. Intermediate 
values of t between 0 and 1 define points on the line between P1 and P2. 
Values of t less than zero define points on the line extended beyond P1, 
and values greater than 1 define points beyond P2. We can define the co- 
ordinate value for points on the line by this pair of equations: 


x = x1 + (x2-x1)t
y = y1 + (y2-y1)t


The relationship of these equations to the line from P1 to P2 is shown here:

[Figure: the line from P1 (x1,y1) to P2 (x2,y2) plotted against the x-axis and y-axis, labelled with x = x1+(x2-x1)t and y = y1+(y2-y1)t; t is 0 at P1 and 1 at P2.]

The diagram shows that the horizontal distance between the points P1 and
P2 is x2-x1, and the vertical distance is y2-y1. When the parameter t is 0, 
the two equations define the coordinates of the point P1, and when t is 1 
they define the coordinates of P2. Points on the line between P1 and P2 will 
be defined by values of t between 0 and 1. 
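
The pair of parametric equations translates almost directly into code. The function below is a small sketch for illustration; the name PointAt() is our own, and the structure definitions are repeated so that it stands alone.

```c
struct Point { double x; double y; };
struct Line  { struct Point P1; struct Point P2; };

/* Return the point on aLine for the parameter value t, */
/* using P = P1 + (P2-P1)t for each coordinate          */
struct Point PointAt(struct Line aLine, double t)
{
  struct Point P;
  P.x = aLine.P1.x + (aLine.P2.x - aLine.P1.x)*t;
  P.y = aLine.P1.y + (aLine.P2.y - aLine.P1.y)*t;
  return P;
}
```

Passing 0 for t gives back P1, passing 1 gives P2, and 0.5 gives the mid-point of the line.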


Using this representation, it’s very easy to obtain the intersection between
two lines. Given a line L1 defined by points P1 and P2, and a second line
L2 defined by points P3 and P4, you just need to solve the equations
resulting from equating the x and y values in the two line definitions:


x1 + (x2-x1)t1 = x3 + (x4-x3)t2
y1 + (y2-y1)t1 = y3 + (y4-y3)t2


Without going through the gory details of getting there, if you solve these
two equations, the value of t1 defining the position of the intersection point
on the first line is determined by:

t1 = ((x4-x3)(y3-y1) - (y4-y3)(x3-x1)) / ((x4-x3)(y2-y1) - (y4-y3)(x2-x1))

This looks a bit messy because the expression for t1 involves the x and y
co-ordinates of the four points defining the two lines, but it boils down to
just one expression - the numerator:

((x4-x3)(y3-y1) - (y4-y3)(x3-x1))

divided by another - the denominator:

((x4-x3)(y2-y1) - (y4-y3)(x2-x1))


If the denominator is zero in the expression for t1, then the two lines are
parallel, otherwise they've got to intersect at some point. In practice, it's
better to test whether the magnitude of the denominator is less than some
suitably small value, 10^-10 say, to avoid numerical problems in the geometric
calculations. We will do this in the following example:
#include <stdio.h>                 /* For input and output */

struct Point                       /* Structure for a point */
{
  double x;                        /* x coordinate */
  double y;                        /* y coordinate */
};

struct Line                        /* Structure for a line */
{
  struct Point P1;                 /* Start point for the line */
  struct Point P2;                 /* End point for the line */
};

/* Function prototypes */
struct Line GetLine(void);
struct Point GetPoint(void);
int Parallel(struct Line L1, struct Line L2);
struct Point Intersection(struct Line L1, struct Line L2);

int main(void)
{
  struct Point aPoint;             /* Declare a point object */
  struct Line L1, L2;              /* Declare two line objects */
  printf("\nWe need the two points defining the first line.");
  L1 = GetLine();
  printf("\nWe need the two points defining the second line.");
  L2 = GetLine();
  if(Parallel(L1,L2))
    printf("\nLines are parallel - no intersection.");
  else
  {
    aPoint = Intersection(L1,L2);
    printf("\nThe intersection point is %.3f, %.3f", aPoint.x, aPoint.y);
  }
  return 0;
}

/*********************************
 * A function to read in a point *
 *********************************/
struct Point GetPoint(void)
{
  struct Point aPoint;             /* Local Point object */
  printf("\n Enter the coordinates of a point:");
  scanf("%lf%lf", &aPoint.x, &aPoint.y);
  return aPoint;
}
Program Analysis


The definitions for the structures Line and Point appear at global scope, so 
they're accessible throughout the program; any function can declare instances 
of either of these structures. If a structure were to be defined within a 
block, then it would only be accessible within that block. 


Note that the definition of the Point structure must be placed ahead of the 
definition of the Line structure, because the Line structure contains objects 
of type Point as members. The definition of a structure must always 
precede its use. If you reverse the sequence of the definitions then you're 
guaranteed compiler error messages. The Point object names have been 
changed compared to our previous definition because a couple of the 
arithmetic statements are rather cumbersome, and we want to keep them as 
short as possible. 


The functions GetPoint() and GetLine() are used to input data defining a 
structure object, and to return the object once it has been created. The 
GetLine() function uses GetPoint() to construct the Point objects, and 
then returns the Line object constructed from the Point objects. Creating a 
function to input values defining a structure is a very useful technique, 
especially with complicated structure types. You can package all the input 
processing and data validation into a function, so that reading a structure 
into your program is more easily managed. 


The Parallel() function checks whether the two lines are parallel using a
direct implementation of the expression we saw earlier. It obtains the
absolute value of the denominator by using the standard library function
fabs() and compares it against 10^-10. The function returns 1 if the absolute
value of denom is less than 10^-10, indicating that the lines are parallel or
almost parallel, and 0 otherwise.


The function Intersection() accepts two Line objects as arguments and 
returns a Point object which represents the intersection of the two lines. 
The Intersection() function first checks whether the lines passed as 
arguments are parallel, since they would cause an error by attempting to 
divide by zero. If the lines are parallel, a message is displayed and the 
program is terminated. 
Although a user of this function should verify that the lines
aren't parallel before calling the function, we can't be sure that 
this will always be the case. By putting this check here, there 
will be an active detection of the error. 
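
To make the description of Parallel() and Intersection() concrete, here is one way the two functions could be written with structures passed by value. Treat it as a sketch rather than the listing itself: the helper Denominator(), the symbol TOLERANCE and the wording of the message are our own assumptions.

```c
#include <math.h>                  /* For fabs() */
#include <stdio.h>                 /* For printf() */
#include <stdlib.h>                /* For exit() */

#define TOLERANCE 1.0e-10          /* Assumed 'suitably small value' */

struct Point { double x; double y; };
struct Line  { struct Point P1; struct Point P2; };

/* Denominator of the expression for t1 (hypothetical helper) */
double Denominator(struct Line L1, struct Line L2)
{
  return (L2.P2.x - L2.P1.x)*(L1.P2.y - L1.P1.y) -
         (L2.P2.y - L2.P1.y)*(L1.P2.x - L1.P1.x);
}

/* Return 1 if the lines are parallel or almost parallel, 0 otherwise */
int Parallel(struct Line L1, struct Line L2)
{
  return fabs(Denominator(L1, L2)) < TOLERANCE;
}

/* Return the intersection point of two lines */
struct Point Intersection(struct Line L1, struct Line L2)
{
  struct Point aPoint;             /* Local store for intersection point */
  double t;                        /* Parameter value on the first line */

  if(Parallel(L1, L2))             /* Guard against dividing by zero */
  {
    printf("\nLines are parallel. Program terminated.");
    exit(1);
  }
  t = ((L2.P2.x - L2.P1.x)*(L2.P1.y - L1.P1.y) -
       (L2.P2.y - L2.P1.y)*(L2.P1.x - L1.P1.x)) / Denominator(L1, L2);
  aPoint.x = L1.P1.x + (L1.P2.x - L1.P1.x)*t;
  aPoint.y = L1.P1.y + (L1.P2.y - L1.P1.y)*t;
  return aPoint;                   /* A copy of aPoint is returned */
}
```

Here the points P3 and P4 of the equations correspond to L2.P1 and L2.P2. Keeping the denominator in one helper means the parallel test and the division can't drift apart.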









The object aPoint is local to the function Intersection() and is destroyed
when you exit from the function, but a copy of it is returned so there's no
problem here. Again, the expression for calculating t is a direct
implementation of the equation we saw earlier. Don't worry if you can't
sort out the algebra - it isn't important at the moment.


Note how we can access and use the members of the Point members of a 
Line object just like any other variable. What you can do with a structure 
member is determined by its type. You can use it in the same way as any 
other variable of the same type. 


If you compile and run this example some typical output would be: 


We need the two points defining the first line. 
Enter the coordinates of a point:1.5 2.5 


Enter the coordinates of a point:6.0 8.0 


We need the two points defining the second line. 
Enter the coordinates of a point:3.0 5.0 


Enter the coordinates of a point:4.5 -3.0 


The intersection point is 3.102, 4.458 


In the case when the lines are parallel, a message is displayed and the 
function computing the intersection point isn’t called. 


Arrays of Structures 


Once a structure type has been defined, you can also declare arrays of that 
structure. Given our structure type Point, an array of points can be 
declared with the statement: 
This declares an array of 10 elements of type Point with the name
MyPoints[]. Referring to members belonging to a particular element of the 
array is much the same as referring to members of a single structure 
variable. To increment the member x of the third array element, you could 
write: 





Using an element of an array of structures is governed by exactly the same 
rules that apply to a single structure object. The only operations you can 
perform are to assign it to another object of the same type, to take its 
address, to pass it to a function as an argument, or return it from a 
function. 
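
A short sketch may help to show an array of structures at work. The function ShiftPoints() and its parameter names are invented for illustration; it simply visits each element of an array of Point objects and updates its members.

```c
struct Point { double x; double y; };

/* Move every point in an array by the same offset (hypothetical helper) */
void ShiftPoints(struct Point Points[], int Count, double dx, double dy)
{
  int i;
  for(i = 0; i < Count; i++)
  {
    Points[i].x += dx;             /* Member x of element i */
    Points[i].y += dy;             /* Member y of element i */
  }
}
```

Given a declaration such as struct Point MyPoints[10]; the x member of the third element is incremented by MyPoints[2].x++; and the whole array is shifted by ShiftPoints(MyPoints, 10, 1.0, 2.0);.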


Using Pointers with Structures 


We saw at the outset that we can obtain the address of a structure variable. 
We could declare a pointer to a structure of type Line with the statement: 





Now we have the variable pLine that can store the address of a Line
structure object. If we’ve declared a structure aLine, then we can store its
address in the pointer in the standard fashion with the statement:

pLine = &aLine;                    /* Store address of aLine in pLine */


If you dereference a pointer to a structure, *pLine, you are referring to the 
structure at the address contained in the pointer. We can use this to access 
the members of the structure. 


Accessing Structure Members through a 
Pointer 


We can use the member selection operator that we've already seen to refer 
to the member of a structure through a pointer containing its address: 
The parentheses here are essential. Without them you would be attempting
to dereference the structure member P1, since the dereference operator is of 
lower precedence than the member selection operator, and the expression 
would be taken as *(pLine.P1). Because this is a slightly awkward 
notation, C provides a special operator, the indirect member selection 
operator, which you can use when accessing members of a structure through 
a pointer. You could use this operator to rewrite the last statement as: 





This has exactly the same effect as the previous statement. As we will soon 
see, there are compelling reasons for using pointers to structures, so this 
notation appears quite frequently in C programs. 
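
The two notations can be compared side by side in a small sketch. The functions SetStart() and StartX() are made up for this illustration, and the Line structure is the one with Point members P1 and P2 used in the first example.

```c
struct Point { double x; double y; };
struct Line  { struct Point P1; struct Point P2; };

/* Set the start point of a line through a pointer, in two equivalent ways */
void SetStart(struct Line *pLine, struct Point aPoint)
{
  (*pLine).P1 = aPoint;            /* Dereference, then select the member */
  pLine->P1 = aPoint;              /* Same effect using the -> operator */
}

/* Return the x coordinate of the start point */
double StartX(struct Line *pLine)
{
  return pLine->P1.x;              /* -> for the pointer, . for the member */
}
```

Note that pLine->P1.x mixes the two operators: P1 is reached through a pointer, but P1 itself is a structure, not a pointer, so its member x is selected with the ordinary member selection operator.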


Structures and Functions 


We have seen in the last program that we can pass a structure to a function 
as an argument, but there is a significant potential overhead in doing this 
because of the way arguments are passed to functions. As you know, 
arguments are passed by value in C, so a copy of each argument is 
produced, and passed on to the function. 


We've seen that when you pass an array to a function, the name of the 
array is converted to a pointer, and a copy of the address of the array is 
used as the argument. 


Structures are handled differently, because in this case a copy of the entire 
structure is made, and that is passed to the function. The same occurs when 
you return a structure from a function, so that with a large structure a lot 
of copying can take place. Even the Planet structure which we defined at
the beginning of this chapter would involve a lot more overhead than an
array when passed as an argument to a function.


The answer is to use pointers; we can even construct structures dynamically 
within a function by allocating memory on the heap. 
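
The difference between the two ways of passing a structure can be seen in a brief sketch. The function names MoveCopy() and MoveInPlace() are assumptions for illustration only.

```c
struct Point { double x; double y; };
struct Line  { struct Point P1; struct Point P2; };

/* Receives a copy of the whole structure: the caller's Line is unchanged */
void MoveCopy(struct Line aLine, double dx)
{
  aLine.P1.x += dx;                /* Alters only the local copy */
  aLine.P2.x += dx;
}

/* Receives a copy of the address only: the caller's Line is updated */
void MoveInPlace(struct Line *pLine, double dx)
{
  pLine->P1.x += dx;               /* Alters the original object */
  pLine->P2.x += dx;
}
```

The first function has all four double values copied for it on every call, while the second receives just one pointer however large the structure is.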


Creating Structures 


We can obtain the size of a structure using the sizeof operator. To allocate 
memory for a Line object on the heap, you would write: 
pLine=(struct Line *)malloc(sizeof(struct Line));


The argument to malloc() uses sizeof to obtain the number of bytes 
required to store a structure of type Line. The pointer to the memory 
allocated that is returned from malloc() will be of type void *, so we 
need to convert this to the type ‘pointer to a Line structure’. This is done 
by the cast (struct Line *). The result of this cast is stored in the 
pointer pLine. Needless to say, in practice you must check that you do get 
a valid pointer back from malloc(). 


With the memory allocated on the heap, we can use the pointer to initialize 
the members of the structure: 


pLine->P1=aPoint;


This statement initializes the P1 member of the new structure with the 
Point object, aPoint. 
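
Putting the allocation, the check and the initialization together gives something like the following sketch. The function name NewLine() and its parameters are hypothetical; the point is that the pointer from malloc() is tested before it is used.

```c
#include <stdlib.h>                /* For malloc() */

struct Point { double x; double y; };
struct Line  { struct Point P1; struct Point P2; };

/* Create a Line on the heap; returns NULL if no memory is available */
struct Line *NewLine(struct Point Start, struct Point End)
{
  struct Line *pLine = (struct Line *)malloc(sizeof(struct Line));
  if(pLine == NULL)                /* Always check the result of malloc() */
    return NULL;
  pLine->P1 = Start;               /* Initialize members through the pointer */
  pLine->P2 = End;
  return pLine;
}
```

The caller owns the memory that is returned and should pass the pointer to free() when the Line object is no longer needed.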


Let's see all this in action with a rewrite of the last example. We can make 
it work much more efficiently by using pointers, and we can create 
structures dynamically on the heap. 


An Example Using Pointers to Structures 


The calculations for determining if the lines are parallel, and to obtain the 
intersection point are exactly the same as in the previous example, but are 
now expressed using pointers: 


/* EX6-02.C Calculating the intersection between two lines */

#include <stdio.h>                 /* For input and output */
#include <stdlib.h>                /* For malloc() */

/* Structure definitions */
struct Point                       /* Structure for a point */
{
  double x;                        /* x coordinate */
  double y;                        /* y coordinate */
};

struct Line                        /* Structure for a line */
{
  struct Point *pP1;               /* Pointer to start point for the line */
  struct Point *pP2;               /* Pointer to end point for the line */
};

  scanf("%lf%lf", &pPoint->x, &pPoint->y);
  return pPoint;
}

/*******************************
 * A function to input a line  *
 *******************************/
struct Line *GetLine(void)
{
  struct Line *pLine;
  /* Get memory for Line object */
  pLine=(struct Line *)malloc(sizeof(struct Line));
  if(pLine==NULL)
  {
    printf("\nLine memory allocation failed. Program terminated");
    exit(1);
  }

  pLine->pP1=GetPoint();
  pLine->pP2=GetPoint();
  return pLine;
}

/*****************************************************
 * A function to determine if two lines are parallel *
 *****************************************************/
int Parallel(struct Line *pL1, struct Line *pL2)
{
  double denom;
  denom=(pL2->pP2->x - pL2->pP1->x)*(pL1->pP2->y - pL1->pP1->y) -
        (pL2->pP2->y - pL2->pP1->y)*(pL1->pP2->x - pL1->pP1->x);
  if(denom<0)
    denom = -denom;
  return denom< 0.0000000001;
}

/**************************************************************
 * A function to return the intersection point of two lines  *
 **************************************************************/
struct Point Intersection(struct Line *pL1, struct Line *pL2)
{
  struct Point aPoint;       /* Local store for intersection point */
  double t;                  /* parameter to define intersection point */

  /* Get parametric value for intersection of L1 and L2 */
  /* First calculate the numerator */
  t=(pL2->pP2->x - pL2->pP1->x)*(pL2->pP1->y - pL1->pP1->y) -
    (pL2->pP2->y - pL2->pP1->y)*(pL2->pP1->x - pL1->pP1->x);
Program Analysis


The structure Line now only contains pointers to Point objects as members. 
It assumes that the Point objects will be created elsewhere, and that their 
addresses will be stored in the structure. This reduces the size of the Line 
structure and means that the indirect member selection operator needs to be 
used to refer to the x and y members of the points defining the line. 


The function GetPoint() that reads in the defining data for a Point object 
also allocates memory on the heap for the Point object, and returns the 
address of the memory allocated once the object has been initialized with 
the data values read. The address returned by malloc() is cast to type 
‘pointer to Point’ before being stored in the variable pPoint. The function 
GetLine() also creates a Line object on the heap and returns its address 
once it has been initialized. 


Managing Memory for Dynamic Structures 


Where you have objects such as a Line object which has members that
point to other objects defined on the heap, it's important to manage the
release of memory correctly. The function Delete() does this in our
example. It first releases the memory for the two Point objects, and then
releases the memory for the Line object. If you were just to release the
memory for the Line object, then the two Point objects defining it would
still exist on the heap, and there would be no way of subsequently releasing
this memory. In our example it doesn’t matter since all memory is returned
to the heap at the end of the program, but if you had a program which
regularly created and destroyed Line objects, then the heap would gradually
be occupied by more and more surplus Point objects. The effective
difference is illustrated here:

[Figure: two views of the heap. Calling only free(pLine) releases the Line object but leaves its two Point objects on the heap, where they cannot be deleted since pP1 and pP2 are no longer available. Calling free(pLine->pP1), free(pLine->pP2) and then free(pLine) deletes all the objects.]
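
The order of release described above can be sketched as follows. The signature of Delete() is not shown in the text, so the one used here is an assumption, as are the helper functions NewPoint() and NewLine() that create the objects.

```c
#include <stdlib.h>                /* For malloc() and free() */

struct Point { double x; double y; };
struct Line  { struct Point *pP1; struct Point *pP2; };

/* Create a Point on the heap (hypothetical helper) */
struct Point *NewPoint(double x, double y)
{
  struct Point *pPoint = (struct Point *)malloc(sizeof(struct Point));
  if(pPoint != NULL)
  {
    pPoint->x = x;
    pPoint->y = y;
  }
  return pPoint;
}

/* Release a Line object and the two Point objects it points to */
void Delete(struct Line *pLine)
{
  if(pLine == NULL)
    return;
  free(pLine->pP1);                /* The points go first ... */
  free(pLine->pP2);
  free(pLine);                     /* ... then the Line object itself */
}

/* Create a Line and its two Points on the heap (hypothetical helper) */
struct Line *NewLine(double x1, double y1, double x2, double y2)
{
  struct Line *pLine = (struct Line *)malloc(sizeof(struct Line));
  if(pLine == NULL)
    return NULL;
  pLine->pP1 = NewPoint(x1, y1);
  pLine->pP2 = NewPoint(x2, y2);
  if(pLine->pP1 == NULL || pLine->pP2 == NULL)
  {
    Delete(pLine);                 /* free() accepts NULL, so this is safe */
    return NULL;
  }
  return pLine;
}
```

Reversing the order in Delete() would be a mistake, because once the Line object has been freed its pP1 and pP2 members must no longer be read.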


The functions Parallel() and Intersection() in the example now have 
their parameters declared as pointers, so just a copy of a pointer to a 
structure is passed to the function for each argument. The computations in 
the function use two levels of the indirect member selection operator to get 
to the co-ordinate values of the Point objects defining each Line object, but 
you should have no difficulty in seeing how this relates to the previous 
version. The expression pL1->pP1 accesses the pointer pP1 which is a
member of the Line object pointed to by pL1. Therefore pL1->pP1->x refers
to the member x of the Point object pointed to by pP1 in the Line object
pointed to by pL1.
Data Organization Using Structures


In the last example we saw how the structure type Line could have 
pointers to other structures as members. A structure can also have a pointer 
to an object of the same type as itself as a member. This provides us with 
some very powerful techniques for managing data in a program. 


Linked Lists 


A structure with a member that is a pointer to an object of the same type 
as itself enables you to daisy chain objects together. This is very useful 
when you don't know exactly how many objects your program will need to 
deal with. For example, we could define a structure Phone as follows: 





This can provide a basis for storing names and associated telephone 
numbers. Each object of type Phone will contain a pointer to a name, a 
pointer to a telephone number, and a pointer to another Phone object. We 
can construct a chain of objects of this type, as in this diagram: 


[Figure: three Phone objects linked in a chain: Jane (213 111 2222), Jack (213 111 3333) and Jean (213 111 4444). Each object points to the next, and the last pointer in the chain is NULL.]

A chain like this, with each object pointing to the next in the chain, is
called a linked list. As long as you know where the first object in the chain 
is, you can get to any object in the list by following the chain of pNext 
pointers. The last pNext pointer will contain NULL, so when you find a NULL 
stored in pNext, you know that you've reached the last object in the chain. 
You can use this technique with any kind of structure; you just need to add 
a pointer member to a structure to allow objects to be linked together. We 
can see how this works through an example. 
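
As a sketch of the idea, the fragment below defines a Phone structure of the kind described and follows the chain of pNext pointers to count the objects in a list. The member name pNumber and the function CountList() are assumptions for illustration; pName and pNext are the names used in the text.

```c
#include <stddef.h>                /* For NULL */

struct Phone
{
  char *pName;                     /* Pointer to a name */
  char *pNumber;                   /* Pointer to a number (assumed name) */
  struct Phone *pNext;             /* Pointer to the next Phone object */
};

/* Count the objects in a list by following pNext until NULL is found */
int CountList(struct Phone *pHead)
{
  int Count = 0;
  while(pHead != NULL)
  {
    Count++;
    pHead = pHead->pNext;          /* Step to the next object in the chain */
  }
  return Count;
}
```

Only the address of the first object is needed; every other object is reached through the one before it.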


Using a Linked List 


This example will read a series of names and associated telephone numbers 
which it will store in a linked list using the struct definition Phone. Input 
will end when you enter a blank line, and the program will then produce a 
list of the names and numbers you have entered. Before we get into the 
code, let's think about how this is going to work. We need to perform three 
separate steps: 


• Read the input
• Display the list
• Clean up the heap


Let's now consider each of these in turn.


Reading the Input 


The basic unit of input is a name plus a number from which we need to 
construct an object of type Phone. We could implement reading the data for 
a single object in a function which will create the Phone object on the heap, 
and return the address of the object. If an object isn't created then the
function can return NULL. The prototype of this function will be:

struct Phone *GetPhone(void);


We can create the linked list in another function that uses GetPhone() to
read the data and construct each Phone object. The function will only need
to worry about linking the Phone objects together and returning the address
of the first object in the list. The prototype of this function will be:


struct Phone *CreateList (void); 
Displaying the List


The task of displaying the list once it's complete falls naturally into the lap
of another function. All it needs to know about is the address of the head
of the list, so we can pass that as an argument. No return value is
necessary as all we are doing is writing to the screen. We can write the
prototype of this function as:


void DisplayList(struct Phone *pList);


Cleaning Up the Heap


The last operation we need to perform is cleaning up the heap. This will
also fit into a single function very well, and given the address of the head 
of the list, all it needs to do is walk through the objects in the list, deleting 
the name and number strings from the free store before deleting each Phone 
object. 


Now let's have a look at what the code looks like: 
Program Analysis


Because our functions do all of the work, the function main() is very 
simple. All it does is output a prompt and then call the functions to create 
the list, display the list, and delete the list. 


Creating the List 


The CreateList() function calls GetPhone() to obtain the first list object 
and store the address returned in the pointer variable pHead, which is 
where we keep the head of the list. The address of the first object is also 
saved in the pointer pCurrent which we'll use to store the address of the
current object as we extend the list. 


CreateList() then calls the function GetPhone() in the while loop which 
continues as long as valid addresses are returned for new Phone objects. 
When a new object is received, the first action is to store its address in the 
pNext member of the last object, which has its address saved in the pointer, 
pCurrent. We then make the new object current by storing its address in 
pCurrent. As soon as a NULL is returned from the function GetPhone(), the
list is complete and the loop ends. 
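
The linking logic just described can be sketched in a few lines. To keep the fragment self-contained, the GetPhone() shown here is only a stand-in that hands out three fixed names instead of reading from the keyboard, and the Phone structure is cut down to a name and a link; both simplifications are ours.

```c
#include <stdlib.h>                /* For malloc() */

struct Phone
{
  const char *pName;               /* Number member omitted for brevity */
  struct Phone *pNext;
};

/* Stand-in for GetPhone(): returns three objects, then NULL */
struct Phone *GetPhone(void)
{
  static const char *Names[] = { "Jane", "Jack", "Jean" };
  static int Next = 0;             /* Index of the next name to hand out */
  struct Phone *pPhone;
  if(Next >= 3)
    return NULL;                   /* Plays the part of a blank line */
  pPhone = (struct Phone *)malloc(sizeof(struct Phone));
  if(pPhone != NULL)
  {
    pPhone->pName = Names[Next++];
    pPhone->pNext = NULL;
  }
  return pPhone;
}

/* Build the list and return the address of its head */
struct Phone *CreateList(void)
{
  struct Phone *pHead = GetPhone();      /* First object is the head */
  struct Phone *pCurrent = pHead;        /* Last object linked so far */
  struct Phone *pNew;
  if(pHead == NULL)
    return NULL;                         /* Nothing was entered */
  while((pNew = GetPhone()) != NULL)
  {
    pCurrent->pNext = pNew;        /* Link the new object to the last one */
    pCurrent = pNew;               /* The new object becomes current */
  }
  return pHead;
}
```

Because pCurrent always holds the address of the last object, each new object is attached in a single step, without walking the list from the head.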


Library Functions to Handle Strings 


The header file STRING.H contains definitions necessary to use string
processing functions provided by the standard library. We use two of these
in the function GetPhone(). One is strlen(), which returns the length of
the string pointed to by its argument, excluding the '\0'. Its prototype is:


size_t strlen(const char *pString);


You will remember that size_t is the type of value returned by the
operator sizeof. It is an unsigned integer type, and on many compilers it is
equivalent to unsigned int.


The other is strcpy() which copies the string pointed to by its second
argument, to the char array address given by its first argument. Copying
continues until '\0' is found, which is also copied. The prototype of this
function is:


char *strcpy(char *pToString, const char *pFromString);

Reading Phone Objects 
The function GetPhone() first reads a name string into the array Buffer[].
If its length, returned by strlen(), is zero, then an empty string must have
been entered - so input ends and NULL is returned. For a non-zero length
name, memory is allocated on the heap to store a Phone object.


Note how the address returned from malloc() is cast to the required type, 
before storing it in pPhone. Note also how in the call to the function 
malloc() we use the expression sizeof (struct Phone) to specify the 
space required. This is most important with structures, because you cannot 
rely on adding up the lengths of the members to determine the number of 
bytes required to store a structure. 


On many computers, variables of two bytes or more are subject to 
boundary alignment, which means that the address in memory of 2-byte 
variables must be a multiple of 2, the address of a 4-byte variable must be 
a multiple of 4, and so on. As a result, if a 4-byte variable follows a 2-byte 
variable, then it may be necessary to leave two bytes unused to ensure 
correct boundary alignment. This is illustrated here: 


[Figure: memory layout of the four variables at addresses 1000, 1004, 1008 and 100C, showing the unused bytes between them.]


With this sequence of a 2-byte variable, a 4-byte variable, a 1-byte variable,
and then another 4-byte variable, a total of 5 bytes can’t be used. Thus, a
struct with these variables as members will require 16 bytes of memory,
even though only 11 bytes are used to store data.
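
You can see the effect of boundary alignment on your own system with a sketch like the one below. The structure Mixed and the two functions are invented for illustration, and the actual sizes depend on the compiler and machine; with a 2-byte short and a 4-byte int the layout would match the one just described.

```c
#include <stddef.h>                /* For size_t */

struct Mixed                       /* Hypothetical structure */
{
  short a;                         /* Often 2 bytes */
  int   b;                         /* Often 4 bytes */
  char  c;                         /* Always 1 byte */
  int   d;                         /* Often 4 bytes */
};

/* Number of bytes actually needed to hold the data */
size_t SumOfMembers(void)
{
  return sizeof(short) + sizeof(int) + sizeof(char) + sizeof(int);
}

/* Number of bytes left unused by boundary alignment */
size_t PaddingBytes(void)
{
  return sizeof(struct Mixed) - SumOfMembers();
}
```

This is why the argument to malloc() should always be written with sizeof applied to the structure type, never as a hand-calculated total.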


Next, the function GetPhone() obtains sufficient memory from the heap to 
exactly accommodate the string, and the name is copied to it using 
strcpy(). The address is then stored in the pName member of the new 
Phone object pointed to by pPhone. The telephone number is then read, and 
is processed in the same way as the name. Finally, after setting the pNext 
member of the Phone object to NULL, the address of the object is returned. 
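
The steps performed by GetPhone() can be sketched without the keyboard input by passing the two strings in as arguments. The function MakePhone(), the helper CopyString() and the member name pNumber are all assumptions made for this illustration.

```c
#include <stdlib.h>                /* For malloc() and free() */
#include <string.h>                /* For strlen() and strcpy() */

struct Phone
{
  char *pName;
  char *pNumber;                   /* Assumed member name */
  struct Phone *pNext;
};

/* Copy a string into exactly enough heap memory (hypothetical helper) */
char *CopyString(const char *pString)
{
  char *pCopy = (char *)malloc(strlen(pString) + 1);   /* +1 for '\0' */
  if(pCopy != NULL)
    strcpy(pCopy, pString);
  return pCopy;
}

/* Create a Phone object on the heap; returns NULL for an empty name */
struct Phone *MakePhone(const char *pNameIn, const char *pNumberIn)
{
  struct Phone *pPhone;
  if(strlen(pNameIn) == 0)         /* Empty name ends the input */
    return NULL;
  pPhone = (struct Phone *)malloc(sizeof(struct Phone));
  if(pPhone == NULL)
    return NULL;
  pPhone->pName = CopyString(pNameIn);
  pPhone->pNumber = CopyString(pNumberIn);
  pPhone->pNext = NULL;            /* Not yet linked to anything */
  if(pPhone->pName == NULL || pPhone->pNumber == NULL)
  {
    free(pPhone->pName);           /* Tidy up if either copy failed */
    free(pPhone->pNumber);
    free(pPhone);
    return NULL;
  }
  return pPhone;
}
```

Since each string is copied to its own block of memory, the input buffer can be reused for the next entry without disturbing objects already created.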


Displaying the List 


The DisplayList() function shows how easy it is to go through a list. The 
address of the first element is passed to the function, and this is used to 
control the while loop. The loop only has two actions: display the current 
object, and then copy the address of the next object to pList. As soon as 
the pointer pList is NULL, we've processed the last object and the loop 
ends. Of course, pList is a copy of the original address passed to the 
function, so there's no problem with changing it as we go along. 
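
A function along these lines would fit the description. It is a sketch only: the output format and the member name pNumber are assumptions.

```c
#include <stdio.h>                 /* For printf() */

struct Phone
{
  char *pName;
  char *pNumber;                   /* Assumed member name */
  struct Phone *pNext;
};

/* Display each name and number in the list */
void DisplayList(struct Phone *pList)
{
  while(pList != NULL)
  {
    printf("\n%s  %s", pList->pName, pList->pNumber);
    pList = pList->pNext;          /* Only the local copy is changed */
  }
}
```

The pointer the caller passed in still holds the address of the head of the list after the function returns.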


Deleting the List


The function DeleteList() walks through the list deleting objects. Note 
how before deleting each object, the memory occupied by the strings 
pointed to by the members of each object is freed first. The address 
contained in the pNext member of each object is obtained before the 
memory for the object itself is freed. 
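
The sequence of operations matters here, as this sketch shows. The member name pNumber is assumed, and the count returned by the function is our own addition so that the result can be checked; the version described in the text needs no return value.

```c
#include <stdlib.h>                /* For free() */

struct Phone
{
  char *pName;
  char *pNumber;                   /* Assumed member name */
  struct Phone *pNext;
};

/* Release every object in the list; returns the number released */
int DeleteList(struct Phone *pList)
{
  struct Phone *pNext;             /* Address of the following object */
  int Count = 0;
  while(pList != NULL)
  {
    pNext = pList->pNext;          /* Save the link before it disappears */
    free(pList->pName);            /* Free the strings first ... */
    free(pList->pNumber);
    free(pList);                   /* ... then the object itself */
    pList = pNext;
    Count++;
  }
  return Count;
}
```

Reading pList->pNext after the call to free(pList) would be an error, which is why the address is saved at the top of the loop.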
Doubly Linked Lists


One limitation of the linked list we have just seen, is that you can only go 
through it one way, from the first to the last. To retrieve any member of the 
list you must start at the beginning and trawl through the list until you 
find the one you are looking for, even if you may know that it’s near the 
end. Even if the member you want to retrieve is just ahead of the one you 
found last, you must still go right back to the beginning of the list to find 
it. One way of improving the situation is to add an extra pointer to each 
member that points to the preceding member. If we modify the Phone 
structure to accommodate this, its definition will be: 





A linked list of objects with backward- and forward-pointing pointers is 
called a doubly linked list. It can graphically be represented in this 
illustration: 


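[Figure: the chain of Phone objects Jane (213 111 2222), Jack (213 111 3333) and Jean (213 111 4444), each now pointing both to the next object and to the previous one. The pointers at each end of the list are NULL.]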





With this arrangement, if you know the address of the last object in the list,
you can work backwards through the list using the pPrevious pointer
members. The object at the head of the list has its pPrevious member set to
NULL. From any position in the list you can move backwards or forwards, so
searching for objects at random will be a lot faster than with the simple
linked list.
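
As an illustration, here is a possible form for the modified structure together with two functions that move through the list in either direction. The order of the members, the member name pNumber and the functions FindHead() and FindTail() are assumptions of this sketch.

```c
#include <stddef.h>                /* For NULL */

struct Phone
{
  char *pName;
  char *pNumber;                   /* Assumed member name */
  struct Phone *pNext;             /* Pointer to the next object */
  struct Phone *pPrevious;         /* Pointer to the preceding object */
};

/* Work backwards from any object to the head of the list */
struct Phone *FindHead(struct Phone *pPhone)
{
  if(pPhone == NULL)
    return NULL;
  while(pPhone->pPrevious != NULL)
    pPhone = pPhone->pPrevious;
  return pPhone;
}

/* Work forwards from any object to the end of the list */
struct Phone *FindTail(struct Phone *pPhone)
{
  if(pPhone == NULL)
    return NULL;
  while(pPhone->pNext != NULL)
    pPhone = pPhone->pNext;
  return pPhone;
}
```

Starting from any object at all, both ends of the list can now be reached, which the singly linked version could not offer.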
We could use a doubly linked list in the previous example but this time we
will automatically construct the list in alphabetical order. Apart from 
updating the structure definition, the only changes necessary are to the 
functions GetPhone() and CreateList(). 


The new version of GetPhone() will look like this: 


struct Phone *GetPhone(void)
{
  struct Phone *pPhone=NULL;
  /* ...code exactly as before */

  pPhone->pNext=NULL;              /* Set next pointer to NULL */
  pPhone->pPrevious=NULL;          /* Set previous pointer to NULL */
  return pPhone;                   /* Return a pointer to the object */
}


Well that doesn't look too strenuous, does it? Just one statement added to 
initialize the pPrevious pointer. Let's take a look at the new version of the 
CreateList() function: 


struct Phone *CreateList(void)
{
  struct Phone *pHead=NULL;        /* Pointer to head of the list */
  struct Phone *pCurrent=NULL;     /* Pointer to current object */

  /* ... */

      pPrevious=pInsert;           /* Update pointer to previous */
      pInsert=pInsert->pNext;

  /* ... */

  return pHead;        /* Return the pointer to the start of the list */
}


We've had to make some quite radical changes here. This is because we
need to search the current list every time we add a new Phone object to
see where it fits. We do this in the inner while loop which uses the
standard library function strcmp() for comparing two strings. Its prototype
is:


int strcmp(const char *ps1, const char *ps2);


It returns a negative integer if ps1 is less than ps2, a zero if the strings are
equal, and a positive integer if ps1 is greater than ps2.


The process of adding an object to the list involves dealing with three 
situations: 


1 Adding to the head of the list. 
2 Adding to the middle of the list. 


3 Adding to the end of the list. 


The first two are handled within the inner while loop, and the third after 
exiting the inner loop. Let's now take a look at each of these possibilities in 
turn. 


Adding to the Head of the List 


The first case arises when the name for the new object should come before 
the name for the first object in the list. This position is illustrated here: 
[Figure: Adding to the head of the list. pCurrent points to the object to be inserted. pInsert points to an object in the list which, because pPrevious is NULL, must be the first. The object being inserted is linked in ahead of the original head of the list.]

The pPrevious pointer holds the address of the object in the list preceding
the one indicated by the pointer pInsert. When pPrevious is NULL, pInsert
must contain the address of the first object in the list, so we're inserting
the new Phone object at the head of the list. To insert the new object
pointed to by pCurrent, we need to do the following:


• Set the pNext pointer member of the new Phone object to the
  address in pInsert, since pInsert points to the object previously at
  the head of the list.

• Set the pPrevious member of the object pointed to by pInsert,
  which is the old head of the list, to the address of the new object.

• Set the pPrevious pointer of the new object to NULL.
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Adding to the Middle of the List 


In the second case, where we're inserting the new object in the middle of 
the list somewhere, we have to break the chain, as illustrated here: 


[Figure: Adding to the middle of the list. pCurrent points to the object to be inserted, which goes between the object pointed to by pPrevious and the object pointed to by pInsert; the pointers in both neighbors are set to point to the new object.] 


The new object is linked to the object pointed to by pInsert in the same 
way as the previous case. The links to the object pointed to by pPrevious 
also have to be set, so the pNext pointer in the pPrevious object is set to 
pCurrent, and the pPrevious pointer of the new object is set to pPrevious. 


Adding to the End of the List 


The last case arises if we pass completely through the current list without 
finding two objects between which the current one should be inserted. This 
situation is illustrated in the diagram over the page: 


Since the pNext member of the new object is set whenever the new object 
is inserted into the list within the loop, we can detect the occasions when 
we pass completely through the list without inserting the new object by 
checking if pNext is NULL. In this case pPrevious will contain the address 
of the last object in the list, so this is used to link the new object at the 
end of the list. The pNext pointer of the last object in the list is set to the 
address of the new object stored in pCurrent, and the pNext pointer for the 
new object is set to NULL. The pPrevious pointer for the new object is set 
to point to what was the last object in the list. 
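
The three cases can be combined in a single function. The following is a minimal sketch rather than the listing from the example: it assumes a Phone structure with pName, pNumber, pNext and pPrevious members, and a hypothetical function InsertSorted() that returns the (possibly new) head of the list.

```c
#include <stddef.h>
#include <string.h>

struct Phone
{
  char *pName;                    /* Pointer to a name              */
  char *pNumber;                  /* Pointer to telephone number    */
  struct Phone *pNext;            /* Pointer to the next object     */
  struct Phone *pPrevious;        /* Pointer to the previous object */
};

/* Insert pCurrent into the sorted list that starts at pHead. */
/* Returns the head of the list after the insertion.          */
struct Phone *InsertSorted(struct Phone *pHead, struct Phone *pCurrent)
{
  struct Phone *pInsert = pHead;       /* Object being compared      */
  struct Phone *pPrevious = NULL;      /* Object preceding pInsert   */

  pCurrent->pNext = NULL;
  pCurrent->pPrevious = NULL;

  while (pInsert != NULL)
  {
    if (strcmp(pCurrent->pName, pInsert->pName) < 0)
    {
      pCurrent->pNext = pInsert;       /* New object precedes pInsert */
      pCurrent->pPrevious = pPrevious; /* NULL when adding at head    */
      pInsert->pPrevious = pCurrent;
      if (pPrevious == NULL)
        pHead = pCurrent;              /* Case 1: new head            */
      else
        pPrevious->pNext = pCurrent;   /* Case 2: middle of the list  */
      return pHead;
    }
    pPrevious = pInsert;               /* Move along the list         */
    pInsert = pInsert->pNext;
  }

  /* Case 3: passed through the whole list (or the list was empty) */
  pCurrent->pPrevious = pPrevious;
  if (pPrevious == NULL)
    pHead = pCurrent;
  else
    pPrevious->pNext = pCurrent;
  return pHead;
}
```

Starting with an empty list (a NULL head) and calling InsertSorted() once for each new object leaves the list in alphabetical order, with the pPrevious links available for walking it backwards.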


Displaying and Deleting the List 


The DisplayList() and DeleteList() functions work as before. The 
output will now be in alphabetical order. If you want a bit of practice with 
a doubly linked list, try rewriting the output function to present the list in 
reverse order. You can get the address of the tail of the list by passing 
through it once from the beginning, following the pNext pointer members 
until you find NULL. 


The linked list is fine for many applications, particularly as it's so easy to 
create a sorted list from the outset, but as soon as you need to do any 
searching, it can be a bit slow. For n objects in a list that you've created in 
order, you will need n/2 comparisons on average to find a particular object. 
With a doubly linked list it reduces to n/4, if you can work out in which 
half of the list the search target is to be found, but it's still proportional to 
the number of items in the list. 


An alternative is to use a structure with two pointers to structures of the 
same type, where one pointer points to an object that is less than the 
current object, and the other pointer points to an object that is in some way 
greater than or equal to the current object. The terms ‘less than’ and 
'greater than' can be defined to suit the objects concerned, and the subject 
of the comparison is whatever you want for your application. 


In the case of strings, ‘less than’ would usually be interpreted as earlier in 
alphabetical sequence, or, if one string is the same as the other except for 
characters appended to it, the shorter string. The library function strcmp() 
for comparing strings uses this meaning, and we'll be using this function a 
bit later on in this chapter. Objects of this kind of structure can be arranged 
in a structure called a binary tree. We could redefine the Phone structure to 
allow this kind of arrangement: 


struct Phone 
{ 
  char *pName;                /* Pointer to a name */ 
  char *pNumber;              /* Pointer to telephone number */ 
  struct Phone *pLeft;        /* Pointer to Phone object<current */ 
  struct Phone *pRight;       /* Pointer to Phone object>=current */ 
}; 








A binary tree of Phone objects is illustrated on the following page: 
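

[Figure: A binary tree of Phone objects with the node for Jack at the root, Helen to its left and Steve to its right; the pLeft and pRight pointers of the nodes at the bottom of the tree are NULL.] 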


Objects in a tree are usually referred to as nodes, and the first node is 
called the root. Other than the root node, each node in a tree is pointed to 
by one other node called its parent. The root node in the tree shown above 
contains the name Jack. The node pointed to by the pointer pLeft has a 
name less than Jack, Helen, and the node pointed to by pRight has the 
name Steve, which is greater than Jack. The shape of the tree will depend 
on the sequence in which nodes were added, and the shape will determine 
how long it takes to search the tree for a particular node. Searching this 
tree involves a maximum of 4 comparisons, and fewer than 3 comparisons on 
average, whereas a linked list with the same objects would require a 
maximum of 8 comparisons and an average of 4. 


Let’s rewrite the last example to use a tree instead of a doubly linked list, 
and add the capability to allow a search for the number of a particular 
person. 


An Example Using a Binary Tree 


This is going to be quite a long example so let’s build it piecemeal. As with 
previous examples we will ignore error checking in the interests of being 
reasonably concise. First we should get the process mapped out for the 
program in terms of functions. The program is going to perform the 
following distinct operations: 


1 Build the tree from a series of data items entered from the keyboard. 
2 Search the tree for a given name and return the corresponding number. 
3 Display the complete tree in alphabetical order. 
4 Delete the tree. 


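We can map these directly to these function prototypes: 


struct Phone *CreateTree(void); 
void ShowNumber(char *pName, struct Phone *pRoot); 
void DisplayTree(struct Phone *pRoot); 
void DeleteTree(struct Phone *pRoot); 

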
The function to build the tree returns a pointer to the root node. The 
function to search the tree accepts two arguments, a pointer to the name to 
be found, and a pointer to the root node of the tree. The functions to 
display the complete tree and to delete the tree only require the pointer to 
the root node to be passed as an argument. 


Creating the Tree 


Since we need to read the data to create Phone objects which we will 
organize into a tree, we can use a modified version of the GetPhone 
function from the previous example: 
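
A minimal sketch of such a function is shown here. It is an assumption about how the function might look, not the listing from the example: the helper names MakePhone() and ReadLine(), the prompts, and the buffer size MAXLEN are all our own choices.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLEN 80                       /* Assumed maximum input length */

struct Phone
{
  char *pName;                          /* Pointer to a name               */
  char *pNumber;                        /* Pointer to telephone number     */
  struct Phone *pLeft;                  /* Pointer to Phone object<current  */
  struct Phone *pRight;                 /* Pointer to Phone object>=current */
};

/* Hypothetical helper: build a Phone object on the heap from two strings */
struct Phone *MakePhone(const char *pName, const char *pNumber)
{
  struct Phone *pNew = malloc(sizeof *pNew);
  if (pNew == NULL)
    return NULL;
  pNew->pName = malloc(strlen(pName) + 1);
  pNew->pNumber = malloc(strlen(pNumber) + 1);
  if (pNew->pName == NULL || pNew->pNumber == NULL)
  {
    free(pNew->pName);                  /* free(NULL) is harmless */
    free(pNew->pNumber);
    free(pNew);
    return NULL;
  }
  strcpy(pNew->pName, pName);
  strcpy(pNew->pNumber, pNumber);
  pNew->pLeft = NULL;                   /* A new node has no children yet */
  pNew->pRight = NULL;
  return pNew;
}

/* Hypothetical helper: read one line from the keyboard, dropping the newline */
static void ReadLine(char *Buffer, int Size)
{
  if (fgets(Buffer, Size, stdin) == NULL)
    Buffer[0] = '\0';
  Buffer[strcspn(Buffer, "\n")] = '\0';
}

/* Read a name and number; returns NULL if an empty name is entered */
struct Phone *GetPhone(void)
{
  char Name[MAXLEN];
  char Number[MAXLEN];

  printf("Enter a name (or just press Enter to finish): ");
  ReadLine(Name, MAXLEN);
  if (Name[0] == '\0')
    return NULL;

  printf("Enter the number for %s: ", Name);
  ReadLine(Number, MAXLEN);
  return MakePhone(Name, Number);
}
```

Separating the allocation from the keyboard input keeps the input code short and lets the allocation be exercised on its own.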


This is almost identical to the earlier version, the only difference being the 
two pointers pLeft and pRight which are initialized with NULL. It returns a 
pointer to the new object, or NULL if an empty name string was entered. We 
can now use this function in the function CreateTree(). 


We can create the tree in two steps. Since there should be at least one 
Phone object, we can read the first one and use it as the root node. We'll 
then have a tree with one node. Any other Phone objects can be inserted 
into the initial tree using a standard approach that we can package in a 
function called InsertNode(). This will need two parameters, a pointer to 
the new Phone object to be inserted, and the pointer to the root node of the 
existing tree. Its prototype will therefore be: 


void InsertNode(struct Phone *pNew, struct Phone *pNode); 


We need to think about how a node is inserted. With a tree of one node it 
is simple. If the name member of the new object is less than the name 
member of the root node then we plug it into the pLeft pointer, and if it 
isn't we plug it into the pRight pointer. What if the tree has more than one 
node? 


If the node we would have plugged the new object into isn't NULL, then we 
have another node to check. What we then have to do is see if the new 
name is less than the name for this node. If it is, then we plug it into the 
left node, and if it isn't then we plug it into the right node. But this is 
exactly what we did with the root node. This suggests that a general 
method for inserting a new node can be implemented as a recursive 
function. Let's look at the code: 


/* Function to insert a new node in the tree */ 
void InsertNode(struct Phone *pNew, struct Phone *pNode) 
{ 
  if(strcmp(pNew->pName,pNode->pName)<0) 
  { 
    if(pNode->pLeft==NULL) 
    { 
      pNode->pLeft=pNew; 
      return; 
    } 
    else 
      InsertNode(pNew,pNode->pLeft); 
  } 
  else 
  { 
    if(pNode->pRight==NULL) 
    { 
      pNode->pRight=pNew; 
      return; 
    } 
    else 
      InsertNode(pNew,pNode->pRight); 
  } 
  return; 
} 


It turns out to be very simple. The function will be called with a pointer to 
the new Phone object, and a pointer to the root node of the tree as 
arguments. If the name member of the new object is less than the name 
field of the current node, then we check whether the pointer pLeft is NULL. 
If it is, then we plug in the address of the new object, and we're done. If it 
isn't then we call the function InsertNode() with the new object as the 
first argument and the node pointed to by pLeft as the second. The process 
if the name isn't less than the name field for the current node is the same. 
The function will call itself until a suitable pLeft or pRight pointer is 
found that's NULL, whereupon the new object will be inserted and the whole 
process will unwind. 


To complete the process for creating a tree, all we need is the 
CreateTree() function. This needs to be able to create a tree with one 
node using the first Phone object, and then insert all the additional objects 
into it: 
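
The following sketch shows one way CreateTree() might be written. So that it can be compiled on its own, GetPhone() is replaced here by a stand-in that hands out a few canned Phone objects and then NULL; the sample data and the compact form of InsertNode() are assumptions made for this illustration only.

```c
#include <stddef.h>
#include <string.h>

struct Phone
{
  char *pName;
  char *pNumber;
  struct Phone *pLeft;
  struct Phone *pRight;
};

/* Stand-in for the keyboard version of GetPhone(): returns each */
/* sample object in turn, then NULL as if an empty name was read */
static struct Phone Sample[] =
{
  { "Jack",  "555-0101", NULL, NULL },
  { "Helen", "555-0102", NULL, NULL },
  { "Steve", "555-0103", NULL, NULL },
  { "Mary",  "555-0104", NULL, NULL }
};
static size_t NextSample = 0;

struct Phone *GetPhone(void)
{
  if (NextSample < sizeof Sample / sizeof Sample[0])
    return &Sample[NextSample++];
  return NULL;
}

/* Compact recursive insert: choose the left or right link, then */
/* either plug the new node in or descend to the next node       */
void InsertNode(struct Phone *pNew, struct Phone *pNode)
{
  struct Phone **ppLink = (strcmp(pNew->pName, pNode->pName) < 0)
                            ? &pNode->pLeft : &pNode->pRight;
  if (*ppLink == NULL)
    *ppLink = pNew;
  else
    InsertNode(pNew, *ppLink);
}

/* Build the tree: the first object is the root, the rest are inserted */
struct Phone *CreateTree(void)
{
  struct Phone *pRoot = GetPhone();    /* First object becomes the root */
  struct Phone *pNew = NULL;

  if (pRoot == NULL)                   /* Nothing was entered at all    */
    return NULL;

  while ((pNew = GetPhone()) != NULL)  /* Insert each further object    */
    InsertNode(pNew, pRoot);

  return pRoot;
}
```

Because the stand-in objects are static rather than allocated with malloc(), a tree built this way must not be passed to a function that frees its nodes.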





The while loop continues as long as the pointer returned by the function 
GetPhone() isn't NULL. The only loop action is the call to InsertNode() for 
each new object. As soon as a NULL is returned by GetPhone() the tree is 
finished. 


Searching the Tree 


We want to be able to search the tree for a name and get back the 
telephone number for that name. The function to do this will need the 
name to be found and the pointer to the root node of the tree as 
arguments. We can get the function to display the number, so we don't 
need a return value. The prototype will therefore be: 


void ShowNumber(char *pName, struct Phone *pNode); 


How is the function ShowNumber() to find a particular name in the tree? 
Our experience with constructing the tree is a good indicator of an 
approach. For any node, if the name equals the name member for the 
current node, then we’ve found it so we can display the number and we’re 
done. If the name is less than the name for the current node, we test the 
node pointed to by pLeft, and if not, we test the node pointed to by 
pRight. If we arrive at a pLeft or pRight pointer that's NULL, then the 
name isn't in the tree. We can implement this as another recursive function: 


/* Function to find a given name and display the number */ 
void ShowNumber(char *pName, struct Phone *pNode) 
{ 
  int Test;                              /* Value from string comparison */ 

  if(pNode==NULL)                        /* If this node is NULL then */ 
  {                                      /* the name is not in the tree */ 
    printf("The name %s was not found.\n\n", pName); 
    return; 
  } 

  Test=strcmp(pName,pNode->pName);       /* Compare name to current node */ 

  if(Test<0)                             /* If it is less */ 
  {                                      /* try the left node */ 
    ShowNumber(pName,pNode->pLeft); 
    return; 
  } 

  if(Test>0)                             /* If it is greater */ 
  {                                      /* try the right node */ 
    ShowNumber(pName,pNode->pRight); 
    return; 
  } 

  /* We have got it - so flaunt it */ 
  printf("The number for %s is %s\n\n",pName,pNode->pNumber); 
  return; 
} 


The first action in the function is to check whether the second argument is 
NULL. If it is, then we've been passed a pLeft or pRight NULL pointer so 
the name isn't in the tree, and we display a message and return. Otherwise 
we compare the name sought with the name member of the current node 
using the library function strcmp(). If the value returned is negative, then 
the name is less than the name for the current node, so we call 
ShowNumber() to check the node pointed to by pLeft. If the result of the 
comparison is positive then we search the right node. If the result of the 
comparison is neither less than nor greater than zero, that leaves only one 
possibility - it must be equal to zero, so we've found the name and we can 
display the number. 


Displaying the Tree 


To display the entire tree in alphabetical order, we can again use a recursive 
function. For any node in the tree, the node pointed to by pLeft must be 
displayed before the current node because its name field will be less than 
the name for the current node. We set it up this way. It follows then that 
for any node we need to display the node pointed to by pLeft, then the 
current node, and then the node pointed to by pRight. Of course if pLeft 
or pRight is NULL then there's nothing to display for that branch. We can 
implement this process as the function DisplayTree(): 
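
A minimal sketch of such a function, assuming the Phone structure defined earlier, might look like this. The printf() layout is our own choice for illustration.

```c
#include <stdio.h>

struct Phone
{
  char *pName;                    /* Pointer to a name               */
  char *pNumber;                  /* Pointer to telephone number     */
  struct Phone *pLeft;            /* Pointer to Phone object<current  */
  struct Phone *pRight;           /* Pointer to Phone object>=current */
};

/* Display the tree in alphabetical order */
void DisplayTree(struct Phone *pNode)
{
  if (pNode == NULL)              /* Nothing here, so nothing to do  */
    return;

  DisplayTree(pNode->pLeft);      /* Everything less than this node  */
  printf("%-20s %s\n", pNode->pName, pNode->pNumber);
  DisplayTree(pNode->pRight);     /* Everything greater than or equal */
}
```

Visiting the left branch, then the node itself, then the right branch is what produces the alphabetical order; printing the node before or after both branches would give a different sequence.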





This is remarkably short for a function that is to display a tree of any size 
and shape. If the passed pointer pNode is NULL, then there's nothing to be 
done and the function returns. Otherwise it calls itself with the pLeft 
pointer as the argument, displays the current node, and then calls itself with 
the pRight pointer as an argument. The whole process is kicked off by 
calling the function with a pointer to the root node as the argument. Easy, 
isn't it? 


Deleting the Tree 


With the amazing success of recursion in the other functions, this must be 
another opportunity to apply it, and indeed it is. Before we can delete any 
node, we must first delete its left and right nodes. All we need is a 
recursive function that does exactly that: 


/* Function to delete the tree */ 
void DeleteTree(struct Phone *pNode) 
{ 
  if(pNode==NULL)                        /* If there is no node */ 
    return;                              /* there is nothing to do */ 

  if(pNode->pLeft!=NULL)                 /* If there is a left node */ 
    DeleteTree(pNode->pLeft);            /* then delete it */ 

  if(pNode->pRight!=NULL)                /* If there is a right node */ 
    DeleteTree(pNode->pRight);           /* then delete it */ 

  free(pNode->pName);                    /* Delete name for this node */ 
  free(pNode->pNumber);                  /* Delete number for this node */ 
  free(pNode);                           /* Now we can delete this node */ 
} 


If the pointer passed is NULL then there's nothing to do, so we return from 
the function. Otherwise, the function calls itself to delete the left node, and 
then calls itself again to delete the right node, and finally deletes the 
current node after first removing its name and number. That's it. All we 
need now is the function main() to tie everything together. 


The Rest of the Program 


/* EX6-04.C Storing phone numbers using a binary tree */ 

#include <stdio.h>                       /* For input and output */ 
#include <stdlib.h>                      /* For malloc() */ 
#include <string.h>                      /* For string functions */ 
#define MAXLEN 80                        /* Maximum input length */ 

/* Structure definitions */ 
struct Phone 
{ 
  char *pName;                           /* Pointer to a name */ 
  char *pNumber;                         /* Pointer to telephone number */ 
  struct Phone *pLeft;                   /* Pointer to Phone object<current */ 
  struct Phone *pRight;                  /* Pointer to Phone object>=current */ 
}; 

void ShowNumber(char *pName, struct Phone *pRoot); 
void DeleteTree(struct Phone *pRoot); 
struct Phone *CreateTree(void); 


Program Analysis 


The function main() is quite straightforward as virtually all the work is 
done in the other functions. The address of the tree root returned by the 
function CreateTree() is stored in pRoot. After a prompt, the while loop 
will search for successive names, and when an empty name is entered the 
process ceases. The complete tree is then displayed by calling 
DisplayTree() and finally the memory occupied on the heap is freed by 
calling DeleteTree(). 


Unions 


A union is like a structure, but its members all occupy the same memory 
area. A union is defined using the keyword union, in a statement with 
syntax similar to that for a structure. To define a template for a union type 
Shared, which can contain a double value, a long value, and a pointer, you 
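could use the statement: 


union Shared 
{ 
  double Value;                          /* A double value */ 
  long Number;                           /* A long value */ 
  char *pName;                           /* A pointer */ 
}; 

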
As with a structure definition, this is a template, and doesn't define any 
variables. A union is at least as large as its largest member, so in this 
case its size will be that of the double member, Value. We can define a 
variable of type Shared with the statement: 


union Shared MyData; 


This declares the variable MyData which can contain values for any of the 
three variables Value, Number or pName, but since they all occupy the same 
memory area, only one can be active at a single time. Referencing a 
member of a union is exactly the same as referring to a member of a 
structure. To set the member Number in the union MyData, you would use 
the statement: 


MyData.Number=99L; 


If you now attempted to use this value as MyData.pName or MyData.Value, 
you would obviously be working with garbage values, since you would be 
interpreting an integer 99 as a pointer, or as part of a floating point 
number. 
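
The sharing of memory can be checked directly. The sketch below assumes the union Shared has the members Value, Number and pName described above; the helper functions MembersOverlap() and SharedSize() are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>

union Shared
{
  double Value;                   /* A double value */
  long Number;                    /* A long value   */
  char *pName;                    /* A pointer      */
};

/* Returns nonzero if every member starts at the address of the union */
int MembersOverlap(void)
{
  union Shared MyData;
  return (void *)&MyData.Value  == (void *)&MyData
      && (void *)&MyData.Number == (void *)&MyData
      && (void *)&MyData.pName  == (void *)&MyData;
}

/* Returns the number of bytes occupied by a union Shared object */
size_t SharedSize(void)
{
  return sizeof(union Shared);
}
```

The value returned by SharedSize() will be at least sizeof(double), since the union must be able to hold its largest member.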


A union can also be a member of a structure, and vice versa. You access 
the members of a union that are members of a structure in the same way 
that we accessed members of a structure of type Point nested in a 
structure of type Line. 


Applications of Unions 


In the days of very limited memory capacity, unions were often used to 
save memory space - this is rarely the case today. One thing a union can 
help you with which is hard to do by any other means, is to treat the same 
data in two different ways, or to manage a collection of data values of 
different types as an array of bytes. If you have a data value of type long 
passed to a function, that sometimes contains a single 4-byte long value, 
and at other times contains a pair of 2-byte int values, you can define a 
union: 
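
A sketch of such a union follows. The member names iValue and jValue match the discussion, but the tag Split and the helper HalfOf() are our own assumptions, and the fixed-width types from stdint.h are used because on many current compilers int is 4 bytes and long may be 8.

```c
#include <stdint.h>

union Split
{
  int32_t iValue;                 /* One 4-byte value              */
  int16_t jValue[2];              /* The same bytes as two 2-byte values */
};

/* Store a 4-byte value and read back one of its 2-byte halves. */
/* Which half holds the low-order bytes depends on the machine. */
int16_t HalfOf(int32_t Whole, int Index)
{
  union Split MyData;
  MyData.iValue = Whole;          /* Write through one member      */
  return MyData.jValue[Index];    /* Read through the other        */
}
```

On a little-endian machine jValue[0] holds the low-order half of iValue; on a big-endian machine it holds the high-order half, so code that depends on the order is not portable.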


Now you can refer to the same memory as two values of type int, 
MyData.jValue[0] and MyData.jValue[1], or as a single long value 
MyData.iValue. 


The situation can arise where you have a collection of different kinds of 
data stored in different variables that you want to handle as a simple array 
of bytes, to write away to a disk file as a single binary record for example. 
You can create a mapping between the variables of different kinds and an 
array of bytes, by aggregating the variables in a structure, and then creating 
a union with the structure, and a suitably sized array of type char, as 
members. 


For example, if we wanted to be able to treat an object of type Planet, the 
first example of a structure we saw, interchangeably as an array of bytes, 
we could define a union as: 





By copying a Planet object to the union with a statement such as: 





we could then move it around as a byte stream using Mapping.Array[]. 
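
As a rough sketch of this technique: the members given to Planet below are hypothetical stand-ins, since the structure's definition is not repeated in this section, and the tag PlanetBytes and the two helper functions are likewise our own names.

```c
#include <string.h>

struct Planet                     /* Hypothetical members for illustration */
{
  char Name[12];
  double Mass;
  double Diameter;
};

union PlanetBytes
{
  struct Planet Body;                   /* The data as a structure     */
  char Array[sizeof(struct Planet)];    /* The same data as raw bytes  */
};

/* Copy a Planet into a byte buffer by way of the union */
void PlanetToBytes(const struct Planet *pPlanet, char *pBytes)
{
  union PlanetBytes Mapping;
  Mapping.Body = *pPlanet;              /* Copy the object to the union */
  memcpy(pBytes, Mapping.Array, sizeof Mapping.Array);
}

/* Rebuild a Planet from a byte buffer by way of the union */
void BytesToPlanet(const char *pBytes, struct Planet *pPlanet)
{
  union PlanetBytes Mapping;
  memcpy(Mapping.Array, pBytes, sizeof Mapping.Array);
  *pPlanet = Mapping.Body;
}
```

A record written this way contains any padding bytes the compiler placed in the structure, so it should be read back only by a program built with the same structure layout.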


Summary 


We've now covered all of the ways for handling data in C. Structures are a 
very powerful mechanism for managing data, providing the foundation for a 
vast variety of techniques. The discussion on trees here barely scratched the 
surface. Although there are many more possibilities for different kinds of 
tree structures beyond what we have discussed, you should now have 
sufficient understanding of the basics to appreciate how the more 
complicated trees work, and are applied, when you meet them. 


The essential points from this chapter are: 


•  A structure is an aggregate of several variables that can be of 
   different types, grouped under a single name. Members of a 
   structure can be of any type, except the same type as the structure 
   itself, although they can include pointers to structures of the same 
   type. 

•  The only operations you can perform on a structure as a whole are 
   to take its address, copy or assign it to a variable of the same 
   structure type, or pass it as an argument to a function. 

•  A structure member is referenced using the member selection 
   operator by combining the name of the structure variable and the 
   member variable name. A structure member can be used in the same 
   way as any other variable of the same type. 

•  Structure members can also be accessed through a pointer to a 
   structure by using the indirect member selection operator, ->. 

•  Structures containing pointers to structures can be used to implement 
   many important data structures, such as linked lists, doubly linked 
   lists, and trees. 

•  A union is a collection of variables that are stored starting at the 
   same address. Members of a union are referenced using the same 
   mechanisms as those for a structure. 


Programming Exercises 


1 Write a program using the Planet structure that constructs a tree of 
Planet objects, and then provides a search capability for a condition to 
be met for a given member, for example, to find all the Planet objects 
that have a mass greater than 1.5. (You'll need to provide a means of 
recording pointers to multiple objects in the tree.) 


2 Write a program to read in several lines of text, and then use a tree 
to store the words occurring in the text in alphabetical order. Record 
and display the frequency of occurrence of each word. Search the tree 
for the word with the maximum number of occurrences in the text. 


3 Define a structure to include a name, an address and a telephone 
number. Write a program to construct a binary tree in name order. 
Sort the structures by telephone number, and by city. (You can 
construct a new tree for each ordering of the data.) 


4 Write a program to calculate the maximum and average search lengths 
for a binary tree. Use the tree in the previous example as the base for 
doing this. (You will need to find every left or right pointer that is 
NULL, and keep track of how many levels it took to reach each one.) 


Using Libraries 





The standard libraries that support applications written in C are defined by 
an ANSI standard, so you will find the same sets of functions available 
with any ANSI-compliant C compiler. By the end of this chapter you will 
understand: 


What groups of functions are provided by the standard library. 


How you can use standard library functions for classifying 
characters. 


How you can obtain the date and the time in your programs. 


How you can convert numeric values expressed as a character string 
to their numeric equivalents. 


How to generate pseudo random numbers. 


How to use string handling and searching functions. 


What mathematical functions are available in the standard library. 


The Standard Library 


The standard library provides functions, type definitions, constant definitions 
of various kinds, and macros that you can use in your programs (we will 
discuss what macros are in Chapter 9). The contents of the standard library 
are sub-divided into 15 groups of facilities. The declarations necessary to 
use each group are defined in a standard header file, so there are 15 
standard headers too: 


ASSERT.H    Debugging support 
CTYPE.H     Tests for character types 
ERRNO.H     Defines symbols corresponding to error codes 
FLOAT.H     Parameters for floating point routines 
LIMITS.H    Upper and lower limits for integer types 
LOCALE.H    Specific country and language support 
MATH.H      Mathematical functions 
SETJMP.H    Non-local branching support 
SIGNAL.H    Errors and other exception handling 
STDARG.H    Variable argument lists 
STDDEF.H    Defines standard data types and macros 
STDIO.H     Input and output 
STDLIB.H    Utility functions including dynamic memory allocation 
STRING.H    String processing functions 
TIME.H      Date and time functions 


Including a Standard Library 


In order to use the contents of any particular group of facilities, you must 
incorporate the appropriate header file into your program using an 
#include command. For example, we've already used this library in all the 
examples we have seen so far: 


#include <stdio.h> 


The #include command must appear at global scope, and always prior to 
using any of the facilities it supports. 


It would take a whole book to cover all the functions and facilities 
provided by the standard library in detail, so we'll have to be selective. The 
file input/output functions supported by STDIO.H will be discussed in 
Chapter 8, and the keyboard and screen operations are summarized in 
Appendix A. We will look in Chapter 9 at the facilities defined in 
ASSERT.H, when we will be looking at debugging and the preprocessor. 


Character Classification Functions 


You will often need to test a character to see if it's of a particular 
classification - a digit, or a lower case letter for instance. The CTYPE.H 
header file supports a range of functions that all accept an argument of 
type int, and return a nonzero value if the argument corresponds to the 
type of character sought. They return 0 otherwise. The functions provided 
are as follows: 


islower()     Tests for lower case. 
isupper()     Tests for upper case. 
isdigit()     Tests for a decimal digit, 0 to 9. 
isxdigit()    Tests for a hexadecimal digit, 0 to 9, A to F (or a to f). 
isalpha()     Tests for a letter, either upper case or lower case. 
isalnum()     Tests for an upper case or lower case letter, or a digit. 
iscntrl()     Tests for a control character. 
isprint()     Tests for a character that prints including space. 
isgraph()     Tests for a character that prints excluding space. 
ispunct()     Tests for a character that prints excluding space, letters and 
              digits. 
isspace()     Tests for a whitespace character. 


In addition, two functions are provided to convert the case of letters. The 
function tolower() returns the lower case equivalent of its argument, and 
toupper() returns the upper case equivalent. 
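
As a brief illustration, the following sketch uses isdigit() to count the digits in a string and toupper() to convert a string to upper case. The function names CountDigits() and MakeUpper() are our own. The casts to unsigned char are there because passing a negative char value to these functions is not defined.

```c
#include <ctype.h>

/* Count the decimal digits in a string */
int CountDigits(const char *pText)
{
  int Count = 0;
  while (*pText != '\0')
  {
    if (isdigit((unsigned char)*pText))   /* Nonzero means a digit */
      Count++;
    pText++;
  }
  return Count;
}

/* Convert a string to upper case in place */
void MakeUpper(char *pText)
{
  for ( ; *pText != '\0'; pText++)
    *pText = (char)toupper((unsigned char)*pText);
}
```

Note that the test is written as if(isdigit(...)) rather than comparing the result with 1, since the functions are only required to return some nonzero value for a match.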


Time and Date Functions 


Sooner or later you will need to measure time in a program. The standard 
library includes functions which can enable you to work with the time 
and the date. They provide output in various forms, generated from the 
hardware clock or clocks in your computer. To use them you must include 
the header file TIME.H in your program. Let's take a look at a few of 
them. 


Getting Processor Time 


Perhaps the simplest function in this area has the prototype: 


clock_t clock(void); 


This function returns the processor time (not the elapsed time) that your 
program has used since it began execution. The processor time is provided 
as a value of type clock_t, which is defined in TIME.H and usually 
equivalent to type long. The value is measured in clock ticks, units 
dependent upon your hardware clock. To convert the value returned by 
the function clock() to seconds, you must divide it by the constant 
CLOCKS_PER_SEC, also defined in TIME.H. The function clock() 
returns a value of -1 if an error occurs. 
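
The conversion to seconds can be sketched as follows. The function name CpuSeconds() is hypothetical, and returning -1.0 to signal an error is our own convention for this illustration.

```c
#include <time.h>

/* Processor time used so far, in seconds, or -1.0 if unavailable */
double CpuSeconds(void)
{
  clock_t Ticks = clock();              /* Processor time in clock ticks */

  if (Ticks == (clock_t)-1)             /* clock() could not supply it   */
    return -1.0;

  return (double)Ticks / CLOCKS_PER_SEC;
}
```

The cast to double before the division matters: dividing one integer value by another would discard the fractional part of a second.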


Getting the Time and Date 


The function time() returns the calendar time, measured from a fixed 
reference date and time, as a value of type time_t, which is also defined in 
TIME.H and is usually equivalent to type long. The prototype of the function 
time() is: 


time_t time(time_t *timer); 


If the argument isn’t NULL, then the current calendar time is also stored in 
the location pointed to by the argument. However, the function is most 
commonly used with a NULL argument. You can convert the value 
returned to a structure containing day and date information in local time, 
by using the library function localtime(), which has the prototype: 


struct tm *localtime(const time_t *timer); 


This function returns a pointer to a structure containing members with 
values detailing the time and the date. The structure type tm is defined in 
the TIME.H header file as: 
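

struct tm 
{ 
  int tm_sec;                /* Seconds after the minute */ 
  int tm_min;                /* Minutes after the hour */ 
  int tm_hour;               /* Hours since midnight */ 
  int tm_mday;               /* Day of the month */ 
  int tm_mon;                /* Number of months since January */ 
  int tm_year;               /* Number of years since 1900 */ 
  int tm_wday;               /* Number of days since Sunday */ 
  int tm_yday;               /* Number of days since January 1 */ 
  int tm_isdst;              /* Daylight saving time flag */ 
}; 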


So to obtain data on the current time and date, you could use the 
statements: 


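time_t MyTime;                   /* Store for time value */ 
struct tm *pNow;                 /* Pointer to the time structure */ 

MyTime=time(NULL);               /* Get the calendar time */ 
pNow=localtime(&MyTime);         /* Convert it to local time */ 
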

This passes the address of MyTime, which stores the value returned from 
the function time() as an argument to the function localtime(). So with 
pNow pointing to the tm structure returned from the localtime() function, 
you can obtain the current time by accessing the members pNow->tm_hour, 
pNow->tm_min, and pNow->tm_sec. 
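
As a small worked example of using these members, the sketch below computes the number of seconds since local midnight. The function name SecondsSinceMidnight() is our own, and returning -1 when localtime() fails is an assumption made for this illustration.

```c
#include <time.h>

/* Seconds elapsed since local midnight, or -1 if the time is unavailable */
long SecondsSinceMidnight(void)
{
  time_t MyTime = time(NULL);            /* Current calendar time */
  struct tm *pNow = localtime(&MyTime);  /* Broken down into fields */

  if (pNow == NULL)
    return -1L;

  return pNow->tm_hour * 3600L + pNow->tm_min * 60L + pNow->tm_sec;
}
```

The structure pointed to by pNow belongs to the library and may be overwritten by the next call to localtime(), so copy any values you need to keep.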


Formatting the Time and Date 


If you want to convert the time and date to a character string for output 
purposes, the standard library provides a fancy function to do this for 
you. Its prototype is: 


size_t strftime(char *pTimeStr, size_t MaxChars, 
                const char *pFormatStr, const struct tm *pTime); 


The first argument, pTimeStr, points to the string where the output from 
the function will be stored. The second argument is the maximum number 
of output characters, usually specified with sizeof applied to the destination 
array (applying it to a pointer would give only the size of the pointer). The 
pFormatStr argument specifies how the output is to appear in pTimeStr. 
This works in a manner similar to that used for the format string to the 
function printf(). Format specifiers are used to indicate which fields from 
the structure pointed to by the fourth argument, pTime, are to appear 
where. The format specifiers you can use here are: 


%a    Abbreviated weekday name 
%A    Full weekday name 
%b    Abbreviated month name 
%B    Full month name 
%c    Local date and time representation 
%d    Day of the month number (01-31) 
%H    Hour in 24-hour format (00-23) 
%I    Hour in 12-hour format (01-12) 
%j    Day number in the year (001-366) 
%m    Month number in the year (01-12) 
%M    Minute as a decimal number (00-59) 
%p    Local AM or PM indicator for a 12-hour clock 
%S    Second as a decimal number (00-59) 
%U    Week number, with Sunday as the first day of the week (00-53) 
%w    Weekday number (0-6; Sunday is 0) 
%W    Week number, with Monday as the first day of the week (00-53) 
%x    Local date representation 
%X    Local time representation 
%y    Year number without the century (00-99) 
%Y    Year number with the century 
%Z    Time-zone name or abbreviation; no characters if time-zone is 
      unknown 
%%    Percentage sign 


Displaying the Day and the Date 


For example we can use some of these in a program like: 
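
A minimal sketch along these lines is shown below. It is not the original listing: the function names FormatTime() and ShowNow(), the size of Buffer, and the particular format string are assumptions chosen to produce output of the kind discussed in this section.

```c
#include <stdio.h>
#include <time.h>

/* Format the time in *pTime into Buffer, for example              */
/* "Friday September 15, 1995 at 11:53 AM". Returns the number of */
/* characters stored, or 0 if the result would not fit.           */
size_t FormatTime(char *Buffer, size_t Size, const struct tm *pTime)
{
  return strftime(Buffer, Size, "%A %B %d, %Y at %I:%M %p", pTime);
}

/* Display the current day, date and time */
void ShowNow(void)
{
  char Buffer[80];                       /* Holds the formatted output */
  time_t MyTime = time(NULL);            /* Current calendar time      */
  struct tm *pNow = localtime(&MyTime);  /* Converted to local time    */

  if (pNow != NULL && FormatTime(Buffer, sizeof Buffer, pNow) > 0)
    printf("I used the time functions on %s\n", Buffer);
}
```

Because Buffer is an array here, sizeof Buffer gives its full size in bytes, which is what strftime() needs as its second argument.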


Program Analysis 


This example will produce output similar to: 
I used the time functions on Tuesday September 15th, 1995 at 11:53 AM 


but unless there is something really weird happening, you'll get a 
different time and date. 


The process is very straightforward. We obtain the current time (in 
seconds) using the function time(), and store it in the variable MyTime: 


MyTime=time(NULL); 


The address of this variable is passed on to the function localtime() to 
obtain a structure containing the time and date data: 





pNow=localtime(&MyTime); 


We pass the pointer to this structure to the function strftime(): 





for it to generate formatted output in the array Buffer[]. As you see, 
each format specifier selects a particular member of the structure pointed 
to by pNow, and inserts it in the output. 


Calculating Elapsed Time 


You can also use the output from the function time() to calculate elapsed 
time. You can get the elapsed time in seconds between two successive 
time_t values returned by time(), by using the function difftime(), 
which has the prototype: 





double difftime(time_t T2, time_t T1); 





This function will return the value T2-T1 expressed in seconds as a value 
of type double. 
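
For instance, a helper that reports the seconds between a start and an end time could be sketched like this. The name SecondsBetween() is our own; note the order of the arguments to difftime(), with the later time first.

```c
#include <time.h>

/* Seconds from Start to End; negative if End is earlier than Start */
double SecondsBetween(time_t Start, time_t End)
{
  return difftime(End, Start);    /* Later time first: End - Start */
}
```

Using difftime() rather than subtracting the two values directly avoids relying on time_t being a simple count of seconds.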


We could define functions to log the elapsed time and the processor time 
used between successive calls, and exercise them in the following example: 


    CPU_Last=CPU_This; 
    return CPU_Used; 
  } 
  /* Only do this first time through */ 
  CPU_Last=clock(); 
  return 0.0; 
} 

long ElapsedTimer(void) 
{ 
  static time_t El_Last=0;             /* Holds calendar time from last call */ 
  time_t El_This=0; 
  long Elapsed=0L; 

  if(El_Last)                          /* If it's not the first time */ 
  { 
    El_This=time(NULL);                /* Get current time value */ 
                                       /* Get elapsed clock time */ 
    Elapsed=(long)difftime(El_This, El_Last); 
    El_Last=El_This;                   /* Save the current clock time */ 
    return Elapsed;                    /* Return the time interval */ 
  } 
  /* We do this only the first time around */ 
  El_Last=time(NULL);                  /* Initialize clock time */ 
  return 0L;                           /* Return zero interval */ 
} 


Program Analysis 


This example performs the rather trivial activity of doing the same
multiplication 10 million times, and then repeats it another 5 million times
for good measure. The functions CPU_Timer() and ElapsedTimer() provide
the CPU time used between calls, and the total elapsed time between calls,
respectively. We call both functions at the beginning of main(), which
sets a current value in the static variable in each function.


The function CPU_Timer() is called after executing the multiply operation in
the for loop 10 million times:


for(i=0L;i<10000000L;i++)
  x=3.4567 * 4.5678; /* Multiply 10m times */


CPU=CPU_Timer(); /* Get time after 10m multiplies */
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You may want to reduce or increase this figure, depending on how much 
money you've invested in your computer. 


After completing the loop, the CPU_Timer() function is called again to
obtain the total CPU time consumed, and this is displayed by the printf()
function:


After another 5 million multiply operations in the next for loop, we display 
the total elapsed time. The output from this example on my computer is: 
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CPU time for ten million multiplies is 19.99 seconds 
Total elapsed time is 31 seconds 
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Of course the total elapsed time is down to all the instructions executed 
between calls, and not just the multiply operations. 
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String Handling Functions 


The declarations necessary to use the string handling functions in the 
standard library appear in the header file STRING.H. We have already used
the functions strlen() to get the length of a string, strcpy() to copy from
one string to another, and strcmp() to compare two strings. There are some 
others in this category which you may find particularly helpful. 


Joining Strings 


To append one string on to the end of another, you can use the function 
strcat(). This has the prototype:


char *strcat(char *pStr, const char *pAddStr);


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The function will copy the string pointed to by the second argument, on to 
the end of the first string pointed to. The return value is a pointer to the
modified string. The original first string must be big enough to hold the
extra characters, otherwise you'll be in deep water.


Where you need to copy a specific number of characters from one string to
another, you can use the function strncat(). This function has the
prototype:


char *strncat(char *pStr, const char *pAddStr, size_t MaxLen);


This will copy up to MaxLen characters from the string pointed to by
pAddStr on to the end of the string pointed to by pStr, and then add a
terminating '\0'. The resulting string will thus be up to MaxLen characters
longer than the original string, pStr. This function can be very useful if
you want to append a particular word from the middle of one string on to
the end of another.


Comparing Strings 


As well as the compare function that we've already seen, there is another 
function that can compare a specific number of characters:


int strncmp(const char *pStr, const char *pAddStr, size_t MaxLen);


This will compare, at most, the first MaxLen characters of pStr with those
of pAddStr.


Searching a String 


We can use the strncmp function to search for words in a string, as shown 
with this example: 










Program Analysis 


The program prompts for two strings to be entered: the first is to be
searched, and the second contains the string to be found within the first.
The pointer to the string to be searched, pString, is moved through the
string as the string FindStr is compared with the characters in pString, up
to the length of FindStr. The loop continues as long as there are enough
characters in pString for a comparison to be made.


When FindStr is found, Count is incremented, and the pointer, pString, is
incremented by the number of characters in FindStr. Each time FindStr
isn't found, pString is incremented by one in order to move to the next
position in the string. This mechanism is shown in this illustration for
part of the sample input:


[Figure: pString starts at the beginning of the string being searched. It is
incremented by 1 at each position where the substring is not found, and by
the length of the substring when it is found.]


An example of output from the program is: 


Enter a string to be searched less than 120 characters: 
Smith, where Jones had had "had", had had "had had".


Enter a string to be found less than 80 characters: 
had 


In the string: 
Smith, where Jones had had "had", had had "had had". 
the string "had" was found 7 times. 


Searching for Characters 


You can also search a string to find how much of the string consists entirely
of characters from a particular set. For example, you could search the string
"1.2 1.3 coordinates" for characters from the set "0123456789. " which
is all the digits, a decimal point and a blank space.
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The strspn Function 
This has the following prototype:


size_t strspn(const char *pString, const char *pCharSet);


This will return the number of characters from the beginning of pString
that consist entirely of characters from the set appearing in the
string pCharSet. This mechanism is illustrated here:


[Figure: pString points to the start of the string, and pString+Length to
the character that follows the first 8 characters.]


Length=strspn(pString, "0123456789. ");


The value returned by the function will be 8, since the first eight characters 
of the string can be found amongst the characters "0123456789. ", but the
ninth cannot. To find the first non-numeric character, you just need to add 
the value returned to the address of the first character in the string. 


The strcspn Function 


On the other hand you might want to ask the question in another way -
how much of the string, starting from the beginning, consists entirely of
characters that aren't from a particular set? The function to make this
search has the prototype:


size_t strcspn(const char *pString, const char *pCharSet);





To use this function to perform a search for what length of a string doesn't 
contain decimal digits, you would write: 
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The variable length will contain the count of the number of sequential 
characters from the beginning of pString, which aren't decimal digits. Thus
the first decimal digit will be at position pString+Length. This mechanism 
is shown here: 


[Figure: pString points to the start of the string, and pString+Length to
the character that follows the first 13 characters.]


Length=strcspn(pString,"0123456789-+"); 


The first 13 characters don’t contain any of the characters from the string 
specified as the second argument to the function strcspn(), so this value is
returned. This example is spacing over non-numeric data. The first 
occurrence of any character from the defined set is indicated by 
pString+Length. 


The strpbrk Function 


Another useful function will directly find the position in a string where the 
first character from a particular set can be found. It has the prototype:


char *strpbrk(const char *pString, const char *pCharSet);





To find the first position in a string that contains a sign, or a decimal digit, 
you would write: 





The position in the string where the first occurrence is found is stored in
pPos, which is of type 'pointer to char'. If none of the characters specified
by the second argument are found, NULL is returned.




Analyzing a String 


You will often come across a situation where you want to analyze a string -
usually keyboard input, consisting of a number of substrings separated by 
some given character. Such substrings are usually referred to as tokens, and 
the characters used to separate them are called delimiters. The process of 
separating a string into tokens is referred to as parsing the string. 


The strtok Function 


You don't necessarily know how many numerical substrings there are, or 
precisely what they contain. All you know is that they're separated by 
commas, and that the numerical substrings won't contain a comma. You 
could write your own routine to do the analysis, but there is an easier 
method; the standard library provides a function to extract tokens from a 
string separated by given delimiters. It has the prototype:


char *strtok(char *pString, const char *pDelimiters);





The first parameter is a pointer to the string that you want to analyze, and 
the second argument is a pointer to a constant string containing the 
delimiters. The address of a token in the string is returned by the function. 


However, it’s not quite as straightforward as that, since a string will 
typically contain several tokens, but this function provides you with a 
mechanism for getting at all of them. It will be easier to understand how 
the function works by looking at an example, so let's assume that we want 
to analyze a string containing numerical values separated by commas. 
Suppose that we have a string defined as: 





The first argument is the address of the string to be analyzed, and the 
second is a string containing the delimiter, which is a comma. Although 
we've only specified a single delimiter, there could be more than one. The 
variable pToken is of type ‘pointer to char’. The function will search from 
the beginning of the string for the first occurrence of the delimiter, and will
replace it with the string terminating character, '\0'. It will finally return
the address of the first character of the first token, in this case
corresponding to the first character in the string, and this will be stored in
pToken. We can then process the token however we like.


To get the next token in the string, we call strtok() again, but this time 
with the first argument as NULL. We will assume that we can reuse the
variable pToken to store the address of the next token, so the statement
to do this will be:


pToken = strtok( NULL, ","); 


The function will find the next token, and after replacing the next delimiter
with '\0', will return the token's address. By repeatedly calling strtok() with the
first argument as NULL, all the tokens in the original string can be found. 
When there are no more tokens in the string, the function strtok() will 
return NULL, so you can end the process of searching for tokens by testing 
the returned value. For example, if we wanted to list all the tokens in our 
sample string on separate lines, we could do this with the statements: 


pToken = strtok( String, "," );

printf("\n%s", pToken);

while( (pToken = strtok( NULL, "," )) != NULL )
  printf("\n%s", pToken);





We have used the same delimiter string for each call of strtok() in 
analyzing the string. This is the norm, but the function does allow you to 
use different delimiter strings on successive calls if you need to. There may
be times when you would like to split your parsing into functions, and in
these cases you must be very careful not to nest calls to strtok(). The
function remembers its place in only one string at a time, so starting on a
second string before finishing the first will lose your position in the first.


The ‘Mem’ Functions 


There is a group of functions, declared in STRING.H, which are useful for
manipulating arrays of bytes, and aren't limited to null-terminated strings.
All the names of these routines begin with ‘mem’ (for ‘memory’), and the
most commonly used is memset(), which is used to fill an area of memory
with a particular value, and which returns a pointer to the newly-filled
memory. This function has the prototype:


void * memset(void *buff, int n, size_t num);
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Note the use of void pointers. We need to be able to pass any sort of 
pointer to memset and return it. A void pointer is the only way we can do 
this without causing errors. 


If, for example, we had a large character array which we wanted to 
completely fill with dashes, we could: 












Two things to watch for - make sure that you specify the 
correct length, or memset will write off the end of the string, 
and secondly, remember to put a null on the end of the string 
before you use it! 


Two other routines, memcpy() and memmove(), both copy a number of bytes 
from one buffer to another, but for memcpy the buffers mustn't overlap, 
while for memmove they can. 





String Conversion Functions 


These functions, declared in the header file STDLIB.H, will convert numbers
represented as a character string into their numeric form. The need for this 
might arise if you read input as a character string, and you want to sort 
out its contents for yourself. There are two basic functions: strtol() which 
converts a character string to long, and strtod() which converts a 
character string to double. 


Converting a String to an Integer 


The prototype of the function to convert a string to an integer is:


long strtol(const char *pStr, char **pEnd, int Base);

This function converts characters from a string to a long value until it finds 
a character that isn’t part of the character sequence defining the value. The 
first parameter defines the string to be converted, whilst the second 
parameter, pEnd is a pointer to a pointer, and will point to the character 
that stopped the processing of the string. The third parameter specifies the 
number base used to represent the number in the string. Base can have any 
value between 2 and 36, although you will rarely need to deal with 
numbers to base 36. 





Base 36 is the maximum mainly because the digits can be 0 
to 9 plus A to Z, so beyond 36 we don’t have symbols to 
represent the digits. 











If Base is set to zero, then the first couple of characters in the input string 
determine the base that is assumed. This works as follows: 


First Character    Second Character    Number Base

0                  1 to 7              Octal
0                  x or X              Hexadecimal
1 to 9                                 Decimal


The number in the string can be preceded by whitespace characters which
are ignored. A plus or minus sign may also be present. For octal numbers
the first digit character should be 0, and for hexadecimal numbers the digit
characters should start with 0x or 0X. So +1239 and -0X3ABC are both valid
strings. Of course, you mustn't have more digits than you can actually store
as a long integer.


If you just want to convert a string to an int decimal value, and you don't 
need to know where it ends in the string, you can use the function atoi() 
which has the prototype:


int atoi(const char *pStr);





This simply returns the converted decimal value as type int. For type long, 
the function atol() is also provided. 
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Be aware of something that can affect atoi, atol and atof 
(mentioned in the next section). If the conversion fails for any
reason, a value of zero is returned. This horrible design flaw
means that it is impossible to differentiate between an error
and a valid value of zero using these three routines.
The reliable way round this is to use strtol() or strtod() instead,
and to check whether the end pointer has moved on from the start
of the string. Writing your own versions of these routines is
harder than it sounds, but a very good exercise!





Converting a String to Floating Point 
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The general library function to convert a string to a value of type double 
has the prototype:


double strtod(const char *pStr, char **pEnd);





The parameters here have the same significance as the corresponding 
parameters for strtol(). The value in the string can be preceded by 
whitespace, and is otherwise represented as a floating point constant or a 
decimal integer. We can represent the general form as: 


[+ or -] [digit string] [decimal point] [digit string] 
[e or E] [+ or -] [digit string] 


where the square brackets indicate that the item is optional. The following 
are all valid floating point values: 


234 -1.234 .025E-3 34e10 3.4E-2


A simple conversion to double without determining where the value ends 
in the string is provided by the function:


double atof(const char *pStr);

Exercising Conversions 


In order to demonstrate such conversion routines, here is a sample program: 


/* EX7-04.C a program to exercise conversion routines */
#include <stdio.h>                      /* For input and output */
#include <stdlib.h>                     /* For conversion routines */

int main(void)
{
  char Buffer[100];                     /* Input buffer */
  char *pChar=NULL;                     /* Pointer to position in Buffer */
  long aLong1=0L, aLong2=0L;            /* Long values */
  double aDouble1=0.0, aDouble2=0.0;    /* Double values */

  printf("\nEnter two floating point values, a decimal integer, "
         "and a hexadecimal value\n");
  /* fgets() is used in place of gets() so that the input can't overrun Buffer */
  if(fgets(Buffer, sizeof Buffer, stdin) == NULL)
    return 1;                           /* No input available */
  aDouble1=strtod(Buffer, &pChar);      /* Read double value */
  aDouble2=strtod(pChar, &pChar);       /* Read double value */
  aLong1=strtol(pChar, &pChar, 10);     /* Read integer value */
  aLong2=strtol(pChar, &pChar, 16);     /* Read hexadecimal value */
  printf("\nThe converted values are %f %f %ld %ld",
         aDouble1, aDouble2, aLong1, aLong2);

  return 0;
}


Program Analysis 


The program reads a string into the Buffer[] array. The first double value 
is read by calling strtod(), with Buffer and the address of the pointer 
pChar as arguments. To obtain the second double value, the pointer pChar 
is used to specify the start point for conversion, and the address of pChar 
is again used as the second argument to the function. 


The function strtol() is used to convert the integer and hexadecimal 
value. In each case the address stored in pChar by the previous function 
call is used to specify the start of the string to be converted. Typical output 
from this program is: 
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Enter two floating point values, a decimal integer, and a hexadecimal value 
123.4 -3.234E-1 7696 0X3bf


The converted values are 123.400000 -0.323400 7696 959 


Try running it with incorrect values to see how the functions cope. 


Converting a Number to a String 


We've covered how to convert strings to numbers, but what about going in 
the opposite direction, and converting numbers to strings? You'll find, if you 
look, that there aren't any library functions to perform this task, and the 
most common way to do it is to use a cousin of printf, called sprintf. 


Like printf, sprintf takes a format string and some optional arguments, 
and formats the arguments according to the instructions. Unlike printf, it 
then saves the output in a string rather than outputting it to the screen. 
Here's an example of how sprintf works: 








Mathematical Functions 


All the functions in this group return values of type double, and are 
supported by the header file MATH.H. The following trigonometric functions 
are provided: 


sin(x)     sine of x                 asin(x)      inverse sine of x
cos(x)     cosine of x               acos(x)      inverse cosine of x
tan(x)     tangent of x              atan(x)      inverse tangent of x
sinh(x)    hyperbolic sine of x      atan2(y,x)   inverse tangent of y/x
cosh(x)    hyperbolic cosine of x    tanh(x)      hyperbolic tangent of x


All angles as arguments or return values are in radians. 


We've already used the functions fabs() (for obtaining the absolute value
of its argument), and sqrt() which calculates a square root. You can
calculate a logarithm to base 10 with the function log10() and a natural
logarithm with the function log(), and to evaluate e^x, the function exp() is
available. The other functions provided through MATH.H are:


frexp(x, int *exp)        Converts x to a value m x 2^n, returning m
                          and storing n in *exp. The value
                          returned will be fractional but not less
                          than 0.5.

modf(x, double *ipart)    Stores the integral part of x in *ipart,
                          and returns the fractional part of x.

ldexp(x,n)                Returns x multiplied by 2^n.

ceil(x)                   Returns the smallest floating point
                          integer that isn't less than x.

floor(x)                  Returns the largest floating point integer
                          that isn't greater than x.

pow(x,y)                  Returns x raised to the power y.

fmod(x,y)                 Returns the floating point remainder
                          when x is divided by y.


If you use values as arguments to mathematical functions outside the range 
permitted for the function, you'll get a domain error. This is recorded by 
storing a value in a standard variable errno of type int, which is defined 
in ERRNO.H. It will be set to the value defined by the standard symbol EDOM 
when a domain error occurs. This is also defined in ERRNO.H. With some 
functions, tan() or pow() for example, it is possible that results can exceed 
the maximum that can be stored in a double variable. In this case errno 
will be set to the value defined by the standard symbol, ERANGE. If you 
want to be sure your results are correct, you should check errno after 
using such functions. 
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Random Number Generation 


Another requirement you will run up against sooner or later is the need to 
generate pseudo-random numbers. For example, if you want to write a 
game, then a random number generation capability is usually necessary. The 
standard library includes two functions in header file STDLIB.H which are 
related to pseudo-random number generation, namely rand() and srand(). 


A pseudo-random number is a number generated by a computer
in a deterministic manner. Such numbers appear to be truly random,
but the sequence repeats after a certain period.





The srand Function 


The function srand() initializes the process of random number generation. 
Its prototype is: 


void  srand(unsigned int Seed); 


The value of Seed is used to start the process off. A given value for the 
parameter Seed will always produce the same sequence of pseudo-random 
numbers. If you want to get a different sequence each time, then you can 
initialize the process using the value returned by the library function 
time(). For example:


srand( (unsigned)time(NULL) );


This is a typical way of generating a pretty random number 
- seeding the generator with the current time (in seconds). 


Another popular seed value is the use of elapsed time before 
a keypress, although such times are hard to determine on 
different platforms. 





The rand Function 


The pseudo-random numbers are actually produced by the rand() function. 
The prototype for the function rand() is: 
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int rand(void); 


This will return an integer value from zero to RAND_MAX. The value of
RAND_MAX is implementation dependent, but it will be at least 32,767.


A Random Program 


To demonstrate how we can use such functions, here is a suitable program: 


/* EX7-05.C Using pseudo-random numbers */

#include <stdio.h>              /* For input and output */
#include <stdlib.h>             /* For random number generation */
#include <time.h>               /* For the time() function */

int main(void)
{
  char *pPrize[]=
   {"first prize - a gift certificate for an Edsel service.",
    "second prize - a first edition of 'The Paper Clip Users Guide'.",
    "third prize - an inflatable watch.",
    "fourth prize - a self teach guide to sword swallowing.",
    "fifth prize - free entry to the world's strongest man competition.",
    "sixth prize - a waterproof ink eraser."};

  srand( (unsigned)time(NULL) );
  printf("\nYou have won %s",
         pPrize[rand()%(sizeof pPrize/sizeof pPrize[0])]);
  return 0;
}


Program Analysis 


This program should produce random output from the set of available 
messages. Since there are only two statements that do anything, this will not 
take long to explain. The array of pointers pPrize[] points to the set of
initializing strings. The call to srand() initializes the random number 
generator using the current value returned from the library function time(). 
One of the strings pointed to by pPrize[] is selected by the index value 
based on the value returned from the function rand(). The remainder after 
dividing the value returned by rand() by the number of elements in the 
array pPrize[] is used as the index. This value will be between 0 and the 
maximum legal index for the array. You can add further initializing strings 
and the program will adapt automatically. 
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Summary 


We have only just penetrated the surface of the standard library here. There 
are many more functions available, so you will be well repaid if you spend 
a little time looking into what there is on your system. Of course, in this 
chapter we have only been discussing those defined in the ANSI standard. 
Just about any C compiler system will contain others which are not within 
the standard. These will certainly provide a broader range of capabilities, 
and in some instances will provide functions which overlap with those in 
the standard library, but which are superior or easier to use. 


The libraries are an essential element in C programming. Once you have 
mastered the language, you need to take some time to familiarize yourself 
with all the functions that seem relevant to your interests. Remember, if you 
are likely to be moving your program between different systems, it is 
important to limit yourself to the functions in the standard library to avoid 
unnecessary difficulties in porting your program from one machine to 
another. 


Programming Exercises 
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1 write a program to convert an input string containing mixed integer 
and floating point values in an arbitrary sequence. Display the values 
found. 


(Hint: you will need to find out where each number 
starts and ends, and then search for the occurrence of 
the distinguishing characters of a floating point value.
A floating point should contain a decimal point, or an 
E or an e, or both.) 





2 Write a program that is the equivalent of printf() for integers and 
floating point values corresponding to %d and %f.


(Hint: You will need to support a variable number of 
arguments determined from the format specifiers in the 
format string. You can use the string search functions to 
find the format specifiers. Look for %, then look at what 
follows to decide what type the argument is.) 


3 Write a program to simulate a fruit machine. Display each row of 
symbols as words. Keep track of winnings and losses. The payout is 
up to you. 


4 Write and test a function to deal a hand of n cards at random. 
Display the hand as characters, for example: Club A, Diamond Q, 
Spade A, Spade 10, Spade J. 
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File Operations 





This chapter is all about reading and writing data to files in your
program, which are usually hard disk files. By the end of this chapter you
will have learnt:


What a stream is.
What the purpose of opening a file is, and how you do it.
How you can read and write to a file one character at a time.
What file buffering is, and how it works.
How formatted input and output for a file works.
How to read and write binary files.
How to randomly access a file.
What error processing functions are available for file operations.


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The Concept of a File 
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You're probably familiar with the basic mechanics of how the hard disk 
on your computer works. If not, it would be a good idea to look into it 
as this can help you to recognize when a particular approach to file usage 
will be efficient, and when it may not be. There is nothing in the concept 
of file processing in C that depends on knowledge of any physical storage 
device. All the functions provided by the standard library are device 
independent. 


However, a disk drive in a particular operating system environment will
have particular operating characteristics that can affect the performance of
your programs, and the efficiency with which you use the available disk
space. Therefore a knowledge of the characteristics of the disk storage on
your computer will help you to avoid inefficient approaches.


A file in C is visualized as a serial sequence of bytes: 


[Figure: a file shown as a sequence of bytes, with the Beginning of File
and the Current Position marked.]


It has a beginning, an end, and a current position, the latter being 
typically defined as a particular number of bytes from the beginning. 


The current position is where any file action, a read or write, will take 
place. You can move the current position to any other point in the file. A 
new current position can be specified as a positive offset from the 
beginning of the file, a negative offset from the end of the file, or in some 
circumstances, as a positive or negative offset from the current position. 


In C a file is referred to as a stream, a flow of data between your main 
computer memory and an external device. The data flow can be to or 
from memory and can be related to almost any external device including 
the keyboard and the screen. 


Processing Files 


When you write a program to process a file, you need a mechanism to 
associate the file operations that go on in your program, with the name of 
a particular physical file on the disk. This will allow your program to 
operate with different files at different times. 


Opening a File 


Before you can use any file it must be opened. This is true even of the 
standard streams stdin, stdout, and stderr, but these are automatically 
opened for you when your program is executed. 


The fopen Function 


You open a file using the function fopen() from the standard library, 
which returns a pointer to a structure of type FILE containing the data 
needed for a specific file. When you declare a pointer of this type, you 
don’t need to use the struct keyword because FILE has already been 
defined using a typedef in the header file STDIO.H.


Declaring fopen 


The function fopen() is also declared in the header file STDIO.H along with
all the other functions for operations with files. It has the prototype: 


FILE *fopen(const char *pName, const char* pMode); 


The first argument to the function fopen() is a pointer to a string 
containing the name of the file that you want to process. 
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The second argument to the function fopen() is a character string which 
specifies what you want to do with the file. This is called the file mode. 
As we shall soon see this spans a whole range of possibilities, but for the 
moment we shall just look at three, which nonetheless comprise the basic 
set of file operations:


"w"    open a file for write operations
"a"    open a file for append operations
"r"    open a file for read operations


Note that these mode specifiers are character strings defined between 
double quotes, not single characters defined between single quotes. 


Calling fopen 


Assuming that the call to fopen() is successful, a pointer is returned that 
you can now use to reference the file in other input and output 
operations. As we saw earlier, the structure associated with a file pointer 
will contain information about the file that the functions supporting file 
operations need. This information includes the name of the file, the 
specified mode, and a pointer to the current position in the file. 


However, you don’t need to worry about the precise contents of this 
structure in practice, since it’s all taken care of by the input and output 
functions. Obviously, if you want to work with several files at once, they 
must each have their own file pointer variable declared, and they each 
need to be opened with a separate call to the function fopen(). 


Using fopen to Create Files 


If we suppose that we wanted to write to an existing file with the name 
MYFILE, we would use the statements: 
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The first statement is a declaration for our file pointer, which we've 
initialized to NULL to be on the safe side. The second statement opens the 
file and associates the physical file called MYFILE with our internal pointer 
pFile. Because we have specified the mode as "w", we can only write to 
the file - we can't read from it. Any existing contents of the file are 
discarded when it's opened, and the file position is set at the beginning 
of the file, so subsequent write operations replace what was there before. 


If the file name MYFILE doesn't already exist, then the fopen() function 
will create it. When you just want to create a new file, simply call 
fopen() in mode “w”, with the first argument specifying the name that 
you want to call the file. 


Using fopen to Append to a File 


If you want to add to an existing file rather than overwriting it, you can 
specify the mode as "a", which is the append mode of operation. Opening 
the file in this mode positions the file at the end of the last piece of data. 
If the file specified doesn't exist, then a new file will be created as it was 
in the case of mode "w", and since the new file is empty, write operations 
will start at the beginning of the file. 
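

The write and append modes can be tried out with a short sketch. The 
helper name CreateThenAppend() and its return convention are assumptions 
made purely for illustration; only fopen() and fclose() come from the 
standard library: 

```c
#include <stdio.h>

/* Hypothetical helper: create (or empty) the named file, then reopen it
   in append mode. Returns 1 on success, 0 if either open fails. */
int CreateThenAppend(const char *pName)
{
  FILE *pFile = NULL;            /* File pointer, initialized to be safe */

  pFile = fopen(pName, "w");     /* Creates the file, or discards old contents */
  if(pFile == NULL)
    return 0;
  fclose(pFile);

  pFile = fopen(pName, "a");     /* Positions at the end of any existing data */
  if(pFile == NULL)
    return 0;
  fclose(pFile);
  return 1;
}
```

Calling CreateThenAppend("MYFILE") would leave an empty file called MYFILE 
in the current directory, so pick a name that isn't already in use. 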


Using fopen to Read From a File 


If we want to read a file, once we have declared our file pointer we 
would open it using the statement: 


Clearly, if we're going to read the file, it must already exist. 


If you inadvertently try to open a file for reading that doesn't exist, 
fopen() will return NULL. You should always check the return value to be 
sure that the file open command succeeded. As with the write mode, 
opening a file for reading sets the file position at the beginning of the 
data in the file. 
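

As a minimal sketch of the check described above, the following helper 
(CanRead() is a name invented here, not a library function) opens a file 
in read mode and reports whether fopen() succeeded: 

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the named file can be opened for
   reading, 0 if fopen() reports failure by returning NULL. */
int CanRead(const char *pName)
{
  FILE *pFile = fopen(pName, "r");   /* Fails if the file doesn't exist */

  if(pFile == NULL)
    return 0;                        /* Always test for NULL before using pFile */
  fclose(pFile);
  return 1;
}
```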
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Writing Characters to a File 


Once we have opened a file for writing we can write to it at any time 
from anywhere in our program, provided that we have access to the file 
pointer which has been set up by the function fopen(). So if you want to 
be able to access a file anywhere in a program containing multiple 
functions, you can declare the file pointer at global scope. As you will 
recall, this is achieved by placing the declaration outside all of the 
functions, usually at the beginning of the program code. 


The fputc Function 


The simplest write operation is provided by the function fputc(), which 
writes a single character to a file. It has the prototype: 


int fputc(int c, FILE *pFile); 


The first argument specifies the character to be written to the file as type 
int, and the second is the file pointer. The character is converted to type 
unsigned char and then written to the file as a single byte. 


If the write is successful, then it returns the character written. Otherwise it 
returns EOF, a special value called the end of file indicator which is 
defined in STDIO.H. EOF is a negative value of type int, so it's guaranteed 
to be different from every character value, which is one reason why the first 
argument and the values returned are of type int rather than char. 


The standard library function putc() performs the same operation as 
fputc(), but is usually implemented as a macro. 
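

A sketch of how fputc() might be used to write a whole string follows. 
The helper WriteChars() is hypothetical and assumes that pFile has already 
been opened for writing: 

```c
#include <stdio.h>

/* Hypothetical helper: writes the string pStr to pFile one character
   at a time. Returns the number of characters written, or EOF on error. */
int WriteChars(FILE *pFile, const char *pStr)
{
  int Count = 0;

  while(*pStr != '\0')
  {
    if(fputc(*pStr, pFile) == EOF)   /* EOF signals a failed write */
      return EOF;
    pStr++;
    Count++;
  }
  return Count;
}
```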


Buffering 
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In practice, characters aren't written to the physical file one by one - this 
would be very inefficient. Hidden from your program and managed by 
the output routine, output characters are written to an area of memory 
called a buffer until it's full, at which point they're written to the file all in one go. 
The buffer size is usually based on the minimum block of data that can 
be written to the storage device: 


[Figure: Buffering. Characters are added to the output buffer at the 
current position; the buffer is written to disk when full or when the 
file position is moved.] 





File buffering is provided automatically, although operations with stdout 
aren't buffered. If you redirect a general file to the screen, buffering is 
usually inhibited. 


Reading Characters from a File 


Once a file has been created, we can read from it at any time from 
anywhere in our program, provided that we have access to the pointer for 
the file which has been set up by the function fopen(). 
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The fgetc Function 


The function fgetc() is complementary to the function fputc() and reads 
a character from a file which has previously been opened for reading, the 
file being specified by the file pointer passed as an argument. The character 
is read as a single byte of type unsigned char and is returned from the 
function as a value of type int if the read is successful. If the operation 
isn't successful then it returns EOF. Typical use of fgetc() can be illustrated 
by the statement: 





The variable MyChar is assumed to have been declared as type int. If you 
want to store the character read as char, you can cast the return value to 
that type, but the check for EOF must be with the original type int value. 
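

A minimal sketch of this pattern is shown below. The helper CountChars() is 
an invented name, and it assumes pFile is already open for reading: 

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters remaining in a file
   that has already been opened for reading. */
long CountChars(FILE *pFile)
{
  int MyChar = 0;                          /* Must be int so EOF can be detected */
  long Count = 0L;

  while((MyChar = fgetc(pFile)) != EOF)    /* Read until end of file or error */
    Count++;
  return Count;
}
```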


Behind the scenes, the actual mechanism for reading a file is the inverse 
of writing to a file. A whole block of characters is read into a buffer. The 
characters are then handed over to your program one at a time as you 
request them until the buffer is empty, whereupon another block is read. 
This makes the process relatively fast, since most fgetc() operations 
won't involve reading the disk, but simply move a character from the 
buffer in main memory to the place where you want to store it, although 
you do have the overhead of calling a function for each character you 
want to read. 


Remember that each file you create is a text file, and can be 
treated as you would any other text file. Viewing any files you 
create in a text editor or word processor is an excellent 
technique if you're not convinced that you are reading the 
data files you created back in correctly. 





Pushing a Character Back 


The library function ungetc() enables you to return a character back to 
the stream, in effect 'unread' it: 













The character supplied as the first argument, ch, will be returned to the 
buffer, and will be available to be read again on the next read operation. 
The character put back into the buffer is returned from the function; 
if the operation fails, EOF is returned. The EOF value can't be 
returned to the buffer. 








Don’t try to push more than one character back onto the 
stream. 








The function ungetc() is very helpful when you're processing a variable 
stream of input, such as a character string of unknown length, 
immediately followed by some numeric data. You could read the input 
until you find a digit with a loop such as: 


while( (Input=getc(stdin)) != EOF && (!isdigit(Input)) ) 
  /* Process the string here */ ; 


When the loop ends, the last character read is either EOF or a digit. If it 
isn’t an EOF character, then you may want to put the character back into 
the input stream in order to enable the numeric value to be processed as 
a separate and complete entity. You could use the ungetc() function to do 
this, as follows: 


if(Input != EOF) 
  ungetc(Input, stdin); 
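

Putting the two fragments together, a sketch along the following lines 
could skip the leading text and then read the number. The helper name 
ReadNumberAfterText() is an assumption, it works on any stream opened for 
reading rather than just stdin, and it borrows fscanf(), the file version 
of scanf(): 

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical helper: skips leading non-digit characters in pFile, pushes
   the first digit back, then reads the number with fscanf().
   Returns 1 if a number was read, 0 otherwise. */
int ReadNumberAfterText(FILE *pFile, long *pValue)
{
  int Input = 0;

  while((Input = getc(pFile)) != EOF && !isdigit(Input))
    ;                                  /* The text would be processed here */

  if(Input == EOF)
    return 0;                          /* No digit found before end of file */

  ungetc(Input, pFile);                /* Return the digit to the stream */
  return fscanf(pFile, "%ld", pValue) == 1;
}
```

Only one character is pushed back, in line with the warning above. 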


Closing a File 


When we've finished with a file, we need to tell the operating system so 
that the file can be released for other purposes, and our file pointer can be 
freed up too. This is referred to as closing a file. We do this through the 
function fclose() from the standard library, which accepts a file pointer as 
an argument. The function returns an int value which is set to EOF if an 
error occurs, and 0 otherwise. Typical usage would be: 


fclose(pFile); /* Close the file associated with pFile */ 


After execution of this statement, the connection between the pointer 
pFile and the physical file name is broken, so pFile can no longer be 
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used to access the physical file it represented. If the file was being 
written, then the contents of the output buffer are written to the file 
before it's closed, ensuring that data isn't lost. 


Why Files Should be Closed 


It's good programming practice to close a file as soon as you've finished 
with it. If your program crashed and you hadn't closed your files 
properly, then you could lose the contents of the output buffer. Another 
reason for closing files as soon as you've finished with them is that the 
operating system may limit the number of files that you can have open at 
one time. The header file STDIO.H defines the symbol FOPEN_MAX as the 
number of files that you're guaranteed to be able to have open at once. 


A Read/Write Example 


We now have enough knowledge of the file input and output capabilities 
of C to write a simple program to write and use a file. So let's do just 
that: 





Program Analysis 


This program provides an illustration of how you can write a file 
character by character. Before running this program, or indeed any of the 
examples working with files, make sure that you use a unique file name, 
in order to avoid overwriting an existing (and perhaps important) file. An 
example of the output is: 


Enter an interesting string of less than 80 characters. 
a man a plan a canal Panama 
amanaP lanac a nalp a nam a 


The call to fopen() creates a new file with the name MYFILE in the 
current directory, and opens it for writing. If you don’t wish to create the 
file in the current directory, then you can specify a full drive and path 
name, such as C:\TMP\MYFILE. 


The if statement checks that we got a valid file pointer back from 
fopen(), and if the file couldn't be opened, then the program terminates: 





The loop which writes the string to the file counts backwards from the last 
character in the string to the first, and so the putc() function call within 
the loop writes the string to the new file character by character, and in 
reverse order. 
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Once the file has been written, it is closed and then reopened, this time in 
read mode, which sets the file position to the first character in the file. 
Again, we check to make sure that the open operation worked. 


We then read the file character by character in the while loop, the read 
operation actually taking place within the loop continuation condition and 
displaying each character as read: 





The process stops when EOF is returned by the function getc(), which 
will occur when we reach the end of the file. 


The last two statements in the program provide the necessary final tidying 
up now we have finished with the file. After closing the file, the program 
calls a new function from the standard library, remove(). This will delete 
the file with the name that is passed as the argument to the function. 
Deleting the file here will avoid cluttering up your disk with stray files. 
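

The steps analyzed above can be sketched as a single function. This is not 
the original listing: the name ReverseViaFile() and its parameters are 
assumptions, and the string is passed in rather than read from the 
keyboard so that the file handling stands out: 

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes pIn to the named file in reverse order,
   reads it back into pOut (capacity OutSize), then deletes the file.
   Returns 1 on success, 0 if either fopen() call fails. */
int ReverseViaFile(const char *pName, const char *pIn,
                   char *pOut, size_t OutSize)
{
  FILE *pFile = NULL;
  int MyChar = 0;
  size_t i = 0;
  size_t n = 0;

  if(OutSize == 0)
    return 0;

  pFile = fopen(pName, "w");              /* Create the file */
  if(pFile == NULL)
    return 0;
  for(i = strlen(pIn); i > 0; i--)        /* Last character first */
    putc(pIn[i-1], pFile);
  fclose(pFile);

  pFile = fopen(pName, "r");              /* Reopen at the start for reading */
  if(pFile == NULL)
    return 0;
  while(n < OutSize-1 && (MyChar = getc(pFile)) != EOF)
    pOut[n++] = (char)MyChar;
  pOut[n] = '\0';
  fclose(pFile);

  remove(pName);                          /* Tidy up the disk */
  return 1;
}
```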


Writing a String to a File 


Analogous to the function puts() which we've used previously for writing 
a string to stdout, we have the function fputs() for writing a string to a 
file. Its prototype is: 


int fputs(const char *pStr, FILE *pFile); 





This accepts as arguments a pointer to the character string to be output, 
and the file pointer. The function continues to write the string to the file 
until it reaches a '\0' character, which it doesn't write to the file. For 
example, the statement: 





will output the string appearing as the first argument, to the file pointed 
to by pFile. 
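

As a sketch only, a function that writes several strings with fputs() 
might look like this. WriteStrings() is a hypothetical name, and pFile is 
assumed to be open for writing or appending: 

```c
#include <stdio.h>

/* Hypothetical helper: writes Count strings to pFile with fputs().
   Returns 1 if every write succeeded, 0 otherwise. */
int WriteStrings(FILE *pFile, const char *pStrings[], int Count)
{
  int i = 0;

  for(i = 0; i < Count; i++)
    if(fputs(pStrings[i], pFile) == EOF)   /* fputs() returns EOF on error */
      return 0;
  return 1;
}
```

Note that fputs() doesn't add a newline of its own, so each string must 
supply one if the lines are to be kept separate in the file. 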
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Reading a String from a File 


Complementing the function fputs() is the function fgets(), which reads 
a string from a file. It has the prototype: 


char *fgets(char *pStr, int nChars, FILE *pFile ); 


This function differs from fputs() in that it has three parameters. The 
function will read a string from the file specified by pFile into the buffer 
pointed to by pStr, which can hold at least nChars characters. Characters 
are read from the file until a newline character, '\n', is read, or a 
maximum of nChars-1 characters have been read from the file. If a 
newline character is read, then it's retained in the string. In either case 
'\0' is appended in memory. If there is no error then the function will 
return the pointer pStr. If an error occurs, or the end of the file is 
reached before any characters are read, NULL is returned. 
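

A minimal sketch of a read loop built on fgets() follows. The helper 
CountLines() and the buffer size LINE_LEN are assumptions; a line longer 
than the buffer would be read in more than one piece and counted more 
than once: 

```c
#include <stdio.h>

#define LINE_LEN 81     /* Assumed maximum line length, including '\0' */

/* Hypothetical helper: counts the lines in a file opened for reading. */
int CountLines(FILE *pFile)
{
  char Line[LINE_LEN];
  int Count = 0;

  while(fgets(Line, LINE_LEN, pFile) != NULL)   /* NULL at end of file or error */
    Count++;
  return Count;
}
```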


An Example 


We could exercise the functions to transfer strings to and from a file in an 
example which uses the append mode: 














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Program Analysis 


In this example, the array of pointers pProverbs[] is initialized using 
three string constants, leaving the compiler to work out the array 
dimension. Each string has '\n' as the last character so that the function 
fgets() will be able to recognize the end of each string. Of course, if 
they were of fixed length, then we could just use the length of the string 
to control how many characters were read. 


After creating and opening the file MYFILE for writing, each of the 
proverbs in the pProverbs[] array is written to the file using the function 
fputs(), in a for loop: 


for(i = 0; i < sizeof(pProverbs)/sizeof(pProverbs[0]); i++) 





This function is extremely easy to use, just requiring a pointer to the string 
as the first argument, and a pointer to the file as the second. The number 
of iterations is calculated using: 





which will give us the number of elements in the pointer array. We could 
have manually counted how many initializing strings we supplied, but 
doing it this way means that the correct number of iterations is 
determined automatically. 


Once the first set of proverbs has been written, the file is closed, and then 
reopened in append mode, which causes the current position for reading 
and writing to be set to the end of the data in the file. 


With the file now open in append mode, we write the additional proverb 
to the file in the array More[], using fputs(). Since we are in append 
mode, the new proverb will be added after the existing data in the file. 
Having written the file, we close it once again and then reopen it for 
reading by using the mode specifier "r", and then read the strings 
successively into the array More[]. Finally the file is closed, and then 
deleted. 
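

The sequence just described can be sketched as follows. This is a 
reconstruction under stated assumptions rather than the original listing: 
the function name AppendDemo(), the proverbs themselves, and the size of 
More[] are all placeholders: 

```c
#include <stdio.h>

/* Hypothetical helper: writes two strings in mode "w", appends a third in
   mode "a", then counts the lines read back with fgets().
   Returns the line count, or -1 if any fopen() call fails. */
int AppendDemo(const char *pName)
{
  const char *pProverbs[] = { "Many hands make light work.\n",
                              "Too many cooks spoil the broth.\n" };
  char More[60] = "A nod is as good as a wink.\n";
  FILE *pFile = NULL;
  int Count = 0;
  size_t i = 0;

  pFile = fopen(pName, "w");              /* Create the file */
  if(pFile == NULL) return -1;
  for(i = 0; i < sizeof(pProverbs)/sizeof(pProverbs[0]); i++)
    fputs(pProverbs[i], pFile);
  fclose(pFile);

  pFile = fopen(pName, "a");              /* Reopen at the end of the data */
  if(pFile == NULL) return -1;
  fputs(More, pFile);
  fclose(pFile);

  pFile = fopen(pName, "r");              /* Reopen at the start to read */
  if(pFile == NULL) return -1;
  while(fgets(More, (int)sizeof(More), pFile) != NULL)
    Count++;
  fclose(pFile);
  remove(pName);
  return Count;
}
```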


Formatted File Input and Output 


Writing files one character at a time isn't adequate for many purposes 
though. Even the ability added by the function fputs() to output a string 
doesn't solve the problem of us wanting to store away large chunks of 
data at a time. 


You're also likely to want to write data to a file as formatted text, derived 
from the numerical values in your program. With this sort of capability 
you'll be able to readily transfer information to other environments, such 
as your word processor for instance. The mechanism for doing this is 
provided by the standard library functions for formatted file input and 
output. 
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Formatted Output to a File 


The standard library function for formatted output to a file is fprintf(). 
It's almost the same as the printf() function, with one extra parameter 
and a slight name change. Its prototype is: 


int fprintf(FILE *pFile,const char *pFormat,...); 


The first parameter is a file pointer, and the remaining parameters are the 
same as those for printf() - a format string, followed by the variables to 
be written. The function returns a count of the number of characters 
written to the file, or a negative value if an error occurs. 


An fprintf Example 


The use of the function fprintf() is typified in the statement: 





The file pointer, pFile, must point to a file which has previously been 
opened in write or append mode. The values of the three variables Num1, 
Num2, and Fnum are written to the file, under the control of the format 
string specified as the second argument. Thus the first two variables are of 
type int and are to be output with a field width of 12, whilst the third 
variable is of type float, and is to be written to the file with a field 
width of 14. 
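

A call matching that description might be wrapped up as in the sketch 
below. The format string is inferred from the field widths given in the 
text, and the helper name WriteRecord() is hypothetical: 

```c
#include <stdio.h>

/* Hypothetical helper: writes two ints in fields of width 12 and a float
   in a field of width 14. Returns whatever fprintf() returns, that is,
   the number of characters written or a negative value on error. */
int WriteRecord(FILE *pFile, int Num1, int Num2, float Fnum)
{
  return fprintf(pFile, "%12d%12d%14f", Num1, Num2, Fnum);
}
```

With values that fit within their fields, the return value is 
12 + 12 + 14 = 38 characters. 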


Formatted Input from a File 
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Formatted input from a file is accomplished using the function fscanf(). 
To read three variable values from a file pointed to by the pointer pFile 
you would write: 





This function works in exactly the same way as scanf() does with stdin, 
except that here we're obtaining input from a file specified by the first 
argument, pFile. The rules that govern the specification of the format 
string and the operation of the function are the same as those that apply 
to scanf(). If an error occurs, such as no input being read, then the function 
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returns EOF, otherwise it returns an int value specifying the number of 
values read. 








It's probably fairly obvious to you by now, but you need to 
be aware that a formatted write doesn't necessarily capture the 
precise value of a variable that is written. For example, if you 
write the value 1.23 with the format %.1f, you actually write 
1.2 to the file (ignoring any leading blanks), losing the 
possibly very important .03. 


There's also a potential problem when you read a file. A file 
is simply a string of characters. How these characters are 
interpreted is determined entirely by the format string used 
to read them, and has no connection with the original values 
that were written. 



















Should mistakes such as these be made, then it can be quite difficult to 
locate the source and solve the problem. 


An Example of Formatting Input and Output 


We could exercise the formatted input and output functions with an 
example which will also demonstrate how the data is subject to 
interpretation in these operations: 


/* EX8-04.C  Formatted writing and reading of a file */ 


#include <stdio.h> 
int main(void) 
{ 
  long Num1=234567L, Num2=345123L, Num3=789234L; /* Input values */ 
  long Num4=0L, Num5=0L, Num6=0L;        /* Values read from the file */ 
  float Fnum=0.0f;                       /* Value read from the file */ 
  int iVal[6]={0};                       /* Values read from the file */ 
  int i = 0;                             /* Loop counter */ 
  FILE *pFile = NULL;                    /* File pointer */ 
  pFile = fopen("MYFILE", "w" );         /* Create file to be written */ 
  if(pFile==NULL)                        /* Check for valid file pointer */ 
  { 
    printf("\nOpen to create a new file failed."); 
    return 1; 
  } 
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If you compile and run this program you should get the following output: 


234567 345123 789234 
234567 345123 789234 


iVal[0] = 23  iVal[1] = 456  iVal[2] = 734  iVal[3] = 512 
iVal[4] = 37  iVal[5] = 89 
Fnum = 234.000000 


This example writes the values of Num1, Num2, and Num3, to the file 
MYFILE. The file is closed and re-opened for reading, and the values are 
read from the file in the same format as they were written. The first two 
lines of the output demonstrate that the original data, and that read from 
the file, are the same. 







We then call the standard library function rewind() which simply moves 
the current position back to the beginning of the file so that we can read 
it again: 


rewind(pFile); 


We could have achieved the same thing by closing the file then re-opening 
it again, but rewind() is more efficient. 


Having repositioned the file, we read the file again with another call to 
fscanf(): 





this time reading the data into the array iVal[] and the variable Fnum. 
This reads the same data as before, but with different formats from those 
used for writing the file. 


You can see in the program output, that the file consists of just a string of 
characters once it has been written, exactly the same as the output to the 
screen from printf(). You can also see that the values you get back from 
the file when you read it will depend on both the format string that you 
use, and the variable list you specify in fscanf(). 


Finally we leave everything clean, neat, and tidy by closing the file, and 
using the function remove() to delete it. 
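

The reinterpretation effect can be reproduced with a short sketch. The 
format strings here are assumptions, chosen because they give the values 
shown in the program output; they aren't taken from the original listing, 
and the helper name Reinterpret() is invented: 

```c
#include <stdio.h>

/* Hypothetical helper: writes three long values as one unbroken run of
   digits, then reads the same characters back with different field widths.
   Returns 1 on success, 0 on any failure. */
int Reinterpret(const char *pName, int iVal[6], float *pFnum)
{
  long Num1 = 234567L, Num2 = 345123L, Num3 = 789234L;
  FILE *pFile = NULL;
  int nRead = 0;

  pFile = fopen(pName, "w");
  if(pFile == NULL)
    return 0;
  fprintf(pFile, "%6ld%6ld%6ld", Num1, Num2, Num3);  /* 18 digits, no spaces */
  fclose(pFile);

  pFile = fopen(pName, "r");
  if(pFile == NULL)
    return 0;
  nRead = fscanf(pFile, "%2d%3d%3d%3d%2d%2d%3f",     /* Widths also total 18 */
                 &iVal[0], &iVal[1], &iVal[2], &iVal[3],
                 &iVal[4], &iVal[5], pFnum);
  fclose(pFile);
  remove(pName);
  return nRead == 7;                                 /* Seven values expected */
}
```

The file holds the characters 234567345123789234, which the second format 
carves up as 23, 456, 734, 512, 37, 89 and 234. 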


Further File Operation Modes 


Thus far we've only processed files in text mode, where information is 
written as strings of ASCII characters. Text mode is generally the default 
mode of operation, but you can specify explicitly that file operations are 
to be in text mode if you wish by adding a 't' at the end of the 
existing file mode specifiers. 


This gives us the mode specifiers of "wt", "rt", and "at". In some 
environments certain characters will be changed in text mode. Under 
MS-DOS for example on IBM compatible PCs, writing a newline character to 
a file causes two characters to be written, a carriage return character (CR), 
and a line feed character (LF). On reading the same file, the two 
characters will be recombined into a single character once more. This can 
cause problems with file position operations which we'll be looking at 
later in this chapter. 


Updating a File 


We can also open a file for update, which means that you can read and 
write to the file. For this mode you use the "r+" specifier, which requires 
the file to exist already. You can also specify update mode as "w+", which 
creates the file, or discards its contents if it already exists. If you 
wanted the mode to be specified explicitly as a text operation, you add a 
't' to the mode specifier, so update mode would become "r+t" or "rt+". 
Either is perfectly acceptable. You could also use "w+t" or "wt+". 


As we have said, in update mode you can both read and write to the file, 
but not one operation immediately following the other. The reason for this 
is because of the nature of the buffer used by functions. If you write to 
the file, then you're changing the contents of the buffer, not the physical 
disk. A subsequent read would transfer information from the physical disk 
into the buffer overwriting the change you have just made. 


To do a read followed by a write, or vice versa, you must make sure that 
the buffer is cleared to the physical disk. You can do this by performing a 
file position change such as rewind() or doing a fflush() on the file. 
The only exception to this rule is if an EOF was returned by the initial 
operation. 


Similarly, the first read from a file will fill a buffer area in memory, and 
subsequent reads will transfer data from the buffer until it's empty, 
whereupon another file read to fill the buffer will be initiated. So for a 
switch from read to write something must be performed to sort the buffer 
out from the previous operation. 
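

A minimal sketch of a read followed by a write in update mode is given 
below, using rewind() as the intervening position change. The helper name 
ReplaceFirstChar() is hypothetical: 

```c
#include <stdio.h>

/* Hypothetical helper: replaces the first character of an existing file.
   Returns the character that was there before, or EOF on failure. */
int ReplaceFirstChar(const char *pName, int NewChar)
{
  FILE *pFile = fopen(pName, "r+");   /* Update mode: the file must exist */
  int OldChar = EOF;

  if(pFile == NULL)
    return EOF;
  OldChar = fgetc(pFile);             /* A read ...                        */
  rewind(pFile);                      /* ... then a reposition ...         */
  if(OldChar != EOF)
    fputc(NewChar, pFile);            /* ... before the write is permitted */
  fclose(pFile);
  return OldChar;
}
```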


Flushing an Output Buffer 


As well as repositioning within the file, there's a library function 
fflush(), which causes the contents of a write buffer to be written to the 
file, or flushed. It has the prototype: 


int fflush(FILE *pFile); 


The single argument specifies the file to be flushed. If the argument is 
NULL, then all buffered output files are flushed. If the function fflush() is 
applied to an input file, then the effect is undefined. 


Unformatted File Input/Output 


The alternative to text mode operations is called binary mode, where 
there's no need for a format string to control input or output, making it 
much simpler than text mode. The binary data as it appears in memory is 
transferred directly to the file. Characters such as '\n' and '\0' which 
have specific significance in text mode are of no consequence in binary 
mode. As there is no data transformation, binary mode is somewhat faster 
than text mode. 


Specifying Binary Mode 


Binary mode is specified by appending a 'b' to the basic mode specifiers, 
giving us the additional specifiers "wb" to write to a binary file, “rb” to 
read from a binary file, “ab” to append data to the end of a binary file, 
and “rb+” to enable the reading and writing of a binary file. 


Since binary mode involves handling the data to be transferred to and 
from the file in a different way from the text mode, we have a new set of 
functions to perform input and output. 


Writing to a Binary File 


To write a binary file, you use the fwrite() function, best explained with 
an example. Assuming that we open the file to be written with the 
statement: 





pFile = fopen("MYFILE", "wb"); 





then we could write to the file with: 





wCount = fwrite(pData, Size, NumItems, pFile); 
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This operates by writing a specified number of objects to a file, where 
each object is a given number of bytes long. The first argument, pData, is 
a pointer containing the starting memory address of the data objects to be 
written. The second argument, Size, specifies the size of each object to be 
written, whilst the third argument, NumItems, defines a count of the 
number of objects to be written to the file. The file is identified by the 
last argument, pFile, which is the file pointer. The function fwrite() 
returns the count of the number of items actually written. If the operation 
was unsuccessful, then this value will be less than NumItems. 


The return value, and the arguments Size and NumItems, are all of the 
same type as that returned by the sizeof operator, defined as size_t, 
which is an unsigned integer type. 


Let's assume that we want to write objects stored in an array Data[]. 
Without knowing anything about the type of the array, we can write the 
entire array to a file with the statement: 





The sizeof operator is used to specify the size in bytes of the objects to 
be transferred, as well as determining how many objects there are in the 
array. Using sizeof is the best way to define the size of an object to be 
written to a file, particularly when the object is a structure where it isn't 
always obvious how many bytes are involved. Of course, in a real context 
we should also check the return value in wCount, to be sure that the write 
was successful. 


Thus our function for binary writes to a file is geared to writing a number 
of objects of any length. You can write in units of your own structures as 
easily as you can write ints, doubles, or bytes. 
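

As a sketch of the sizeof technique described above, the following 
hypothetical helper writes a small array in one call and checks the count 
returned. The name WriteData() and the array contents are assumptions, and 
pFile is taken to be open in a binary write mode: 

```c
#include <stdio.h>

/* Hypothetical helper: writes a small array of long values to pFile in
   binary form. Returns 1 if every element was written, 0 otherwise. */
int WriteData(FILE *pFile)
{
  long Data[] = { 1L, 4L, 9L, 16L, 25L };
  size_t NumItems = sizeof(Data)/sizeof(Data[0]);   /* Element count */
  size_t wCount = 0;

  wCount = fwrite(Data, sizeof(Data[0]), NumItems, pFile);
  return wCount == NumItems;           /* Fewer items written means an error */
}
```

The sizeof(Data)/sizeof(Data[0]) idiom only works where Data is visible as 
an array; if only a pointer to the data is available, the element count 
has to be supplied separately. 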


Reading from a Binary File 


To read a binary file once it has been opened in read mode, you use the 
function fread(). Using the same variables that we used in our example 
of writing a binary file, to read the file we would use a statement such 
as: 
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This operates as the exact inverse of the write operation. This function 
reads the number of objects determined by the expression sizeof (Data) / 
sizeof (Data[0]), each of size sizeof(Data[0]) bytes, into memory 
starting at the address of Data, and returning a count of the number of 
objects read. If there’s insufficient data in the file, or if an error occurs, 
then the count will be less than the number of objects requested. 
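

The short-count behavior can be seen in a sketch like this one. 
ReadData() is a hypothetical helper; because it receives only a pointer, 
the number of elements is passed in rather than calculated with sizeof: 

```c
#include <stdio.h>

/* Hypothetical helper: reads up to MaxItems long values from a binary file
   into Data. Returns the number of complete values actually read. */
size_t ReadData(FILE *pFile, long *Data, size_t MaxItems)
{
  return fread(Data, sizeof(long), MaxItems, pFile);
}
```

If the file holds only three values and five are requested, the function 
returns 3 and the remaining elements of Data are left unchanged. 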


A Binary File Example 


We could apply binary file operations to the program that we saw in 
Chapter 4 that calculated prime numbers. This time we'll use a disk file 
as a buffer to enable the program to calculate a larger number of primes. 
As this program consists of several functions, let's first take a look at 
main(): 


/* EX8-04.C A prime example using binary files */ 


#include <stdio.h>                    /* For input and output */ 
#include <math.h>                     /* For square root function */ 

#define ROW_SIZE 5                    /* Number of primes output per line */ 
#define MEM_PRIMES 100                /* Count of number of primes in memory */ 


/* Function prototypes */ 
int TestPrime(unsigned long Trial);               /* Test for primeness */ 
void PutBuffer(unsigned long *Primes, int Index); /* Write primes to file */ 
int Check(unsigned long *Primes, int index, unsigned long Trial); 


char *MyFile="MYFILE";                /* Physical file name */ 
FILE *pFile;                          /* File pointer */ 
unsigned long Primes[MEM_PRIMES]={ 2UL, 3UL, 5UL, 0L }; 
int index=3;                          /* Index of free location in memory */ 
int nRec=0;                           /* Number of file records */ 

int main(void) 
{ 
  unsigned long Trial=5UL;            /* First prime candidate */ 
  long NumPrimes=3L;                  /* Prime count */ 
  long Total=0L;                      /* Total required */ 
  printf("How many primes would you like? "); 
  scanf("%ld", &Total);               /* Read how many */ 
  /* Prime finding and storing loop */ 
  while(NumPrimes<Total)              /* Loop until we have the number required */ 
  { 
    Trial+=2UL;                       /* Next value for checking */ 
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Program Analysis 


After the #include statements, we have the definition of ROW_SIZE for the 
number of primes to be output on a line, and MEM_PRIMES which is the 
maximum number of primes to be held in memory. Ideally, MEM_PRIMES 
should be an integer multiple of ROW_SIZE, otherwise the output won't be 
quite so neat. 


Once the program has computed a MEM_PRIMES number of primes, the 
primes in memory will be written to a disk file automatically. This will 
free up the array used to store them in memory so that it can be used to 
store the succeeding primes. If you request a number of primes less than 
MEM_PRIMES then none will be written to disk. 


We have included the file pointer pFile and the pointer MyFile as global 
variables to allow input and output operations from anywhere in the 
program. 


The task of checking for a prime is performed by the function 
TestPrime() which is called within the loop, and returns 1 if the value 
passed to it is prime, or 0 otherwise. If a prime is found then we store 
it in the next available element in the Primes[] array, and increment the 
variable index to point to the next element. We keep track of how many 
primes we have in total with the variable NumPrimes. 







Each time that we find a new prime and add it to the Primes[] array, we 
need to check whether the array is full. If it is, then we need to display 
the primes in the array and write the array to the file. This is achieved 
with the statements: 


if (index==MEM_PRIMES)                /* Check if memory is full */ 
{                                     /* It is so */ 
  PutBuffer(Primes, index);           /* Display the current block */ 
  pFile=fopen(MyFile, "ab");          /* and write them away */ 
  fwrite(Primes, sizeof(long), MEM_PRIMES, pFile); 
  fclose(pFile);                      /* Now close the file */ 
  index=0;                            /* Reset count of primes in memory */ 
  nRec++;                             /* Increment file record count */ 
} 


The primes are displayed by the function PutBuffer(). You may wonder 
why we need to keep opening and closing the file here, rather than 
opening it once and then writing to it whenever the need arises. 
Remember that once we have primes in the file, they will need to be read 
back for use as divisors in testing for a new prime. The easiest way to 
handle this is to open the file for writing here and then close it, opening 
it again for reading when we're testing a value, and closing it again when 
we're done. After writing the primes to the file, the variable index is reset 
to zero, and nRec, which counts the number of blocks of primes in the 
file, is incremented by 1. 


When sufficient primes have been found, the function PutBuffer() is 
called once more to display any still remaining in memory that weren't 
written to the file. 


Validating a Prime 
The function to check whether a value is prime is as follows: 


/********************************************************************* 
 * Function to test for primeness using primes in memory and on file * 
 * Returns a positive value for a prime found, zero otherwise        * 
 *********************************************************************/ 


int TestPrime(unsigned long N) 
{ 
  unsigned long Buffer[MEM_PRIMES];   /* local buffer for file data */ 
  int i=0;                            /* Loop counter */ 
  int k=0;                            /* Return value from Check() */ 
  if(nRec>0) 
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Function Analysis 


The function TestPrime() accepts a candidate value as an argument, and 
returns 1 if it’s prime, and 0 if it isn't. It uses the function Check() to test 
for exact division of the candidate by a block of primes passed to it as an 
argument. 


If the function Check() finds an exact division then it returns 0 indicating 
that the candidate isn't prime. The function Check() also determines 
whether a prime used as a divisor of the candidate exceeds the square 
root of the value being tested. If it does, then the value must be prime 
and 1 is returned. The function Check() returns -1 if all the primes 
passed to it have been used as divisors without an exact division, but the 
largest of them is less than the square root of the candidate. This means 
that more checking is necessary against larger primes. 


As you may remember, a prime is a number with no factors other than 1
and itself. It is sufficient to check whether a number is divisible by any of
the primes that don't exceed the square root of the number.


If we've written anything to the file, then this will be indicated by a 
positive value of nRec. The primes in the file need to be used as divisors 
first, because they're lower than those in memory since we compute the 
primes in sequence. 


If nRec is positive, then we read one block of primes from the file into
the array Buffer[] using the function fread():


fread(Buffer, sizeof(long), MEM_PRIMES, pFile);


The count of the number of objects will always be MEM_PRIMES since we
always write the whole array to the file each time.


If the contents of the file have been exhausted, the function Check() is 
called with the array of primes in memory being passed as an argument. 
If a prime is found then the function Check() will return 1, otherwise 
zero will be returned. 


Checking for a Divisor 
The code for the function Check() is as follows: 


/***********************************************************************
 * Function to check for division by an array of primes                *
 * Returns 1 if a prime found, zero if not a prime, -1 for more checks *
 ***********************************************************************/
int Check( unsigned long *pBuffer, int Count, unsigned long N )
{
   unsigned long *pEnd=&pBuffer[Count-1];
   unsigned long RootN=0UL;

   RootN=(unsigned long)(1.0+sqrt((double)N));  /* Upper limit for divisors */

   while (pBuffer++ != pEnd)
   {
      if ((N%(*(pBuffer)))==0UL)     /* Check for exact division */
         return 0;                   /* If so not a prime */

      if (*pBuffer>RootN)            /* Check whether divisor exceeds square root */
         return 1;                   /* If so must be a prime */
   }
   return -1;                        /* More checks necessary... */
}


Function Analysis 
This function checks whether any of the primes contained in the area
pointed to by pBuffer divide exactly into the test value supplied as the
third argument. Because the computation will be carried out using
pointers, the function defines a pointer pEnd of type pointer to unsigned
long, which points to the last prime in the block passed as a parameter.


The integer variable RootN will hold the upper limit for divisors to be
checked against the trial value, which is the integer part of one plus the
square root of the value being tested for primeness. Divisors beyond this
limit need not be tried.


On each iteration, the pointer pBuffer is incremented to point to the 
next prime. This occurs on the first iteration too, so we automatically 
avoid making the unnecessary check with the first prime which is 2. 
When pBuffer contains the same address as pEnd the loop is terminated, 
since all the primes in the current block have been used as test divisors. 


Checking for exact division is done by taking the remainder after dividing
N by the value that pBuffer points to. If the remainder is zero then N isn't
prime, and zero is returned. If the test division isn't exact, then the
current divisor is checked to see whether it's greater than the square root
of N, and if it is then we have a prime and we're done, so 1 is returned.
Otherwise testing continues with the next prime.
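

The way TestPrime() and Check() co-operate can be sketched as follows. This is a minimal illustration under stated assumptions, not the listing from the program: the names CheckBlock(), TestPrimeSketch() and BLOCK are hypothetical, the file name and block count are passed as parameters rather than taken from MyFile and nRec, and the square root test is swapped for the integer comparison divisor > N/divisor so that no floating point arithmetic is needed. The sketch also tries every prime in a block, including the first. As the loop in Check() reads, it steps past element 0 of whatever block it is given, which is harmless for the divisor 2 when only odd candidates are tested, but would also pass over the first prime of any later block.

```c
#include <assert.h>
#include <stdio.h>

#define BLOCK 4   /* Hypothetical block size, standing in for MEM_PRIMES */

/* Try each prime in a block as a divisor of N.
   Returns 0 if one divides N exactly, 1 if N must be prime,
   and -1 if larger primes are still needed. */
static int CheckBlock(const unsigned long *pBuffer, int Count, unsigned long N)
{
   int i;

   for (i = 0; i < Count; i++)
   {
      if (pBuffer[i] > N / pBuffer[i])   /* Divisor is beyond the square root */
         return 1;
      if (N % pBuffer[i] == 0UL)         /* Exact division, so not prime */
         return 0;
   }
   return -1;
}

/* Test N against nBlocks blocks of primes stored in FileName, then against
   the MemCount primes in pMem. Returns 1, 0 or -1 as CheckBlock() does;
   -1 is also returned if the file can't be read. */
int TestPrimeSketch(const char *FileName, int nBlocks,
                    const unsigned long *pMem, int MemCount, unsigned long N)
{
   unsigned long Buffer[BLOCK];
   FILE *pFile = NULL;
   int i;
   int k;

   assert(N >= 2UL);
   if (nBlocks > 0)
   {
      pFile = fopen(FileName, "rb");     /* Open for reading only */
      if (pFile == NULL)
         return -1;
      for (i = 0; i < nBlocks; i++)
      {
         if (fread(Buffer, sizeof Buffer[0], BLOCK, pFile) != BLOCK)
            break;                       /* Short read */
         k = CheckBlock(Buffer, BLOCK, N);
         if (k >= 0)                     /* Decided one way or the other */
         {
            fclose(pFile);
            return k;
         }
      }
      fclose(pFile);
      if (i < nBlocks)
         return -1;
   }
   return CheckBlock(pMem, MemCount, N); /* File exhausted, use memory */
}
```

With the blocks {2, 3, 5, 7} and {11, 13, 17, 19} on file, the value 121 is rejected only because 11, the first prime of the second block, is tried as a divisor.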


Outputting Primes 


The last function in our program transfers those primes stored in the 
buffer pointed to by the first argument, to the display screen in lines of 
five. The number of primes to be displayed is specified by the second 
argument. The code for the function is as follows: 
Each time the loop index i can be divided by ROW_SIZE without a
remainder, we write a newline character. This ensures that we get
ROW_SIZE values on each line. The format specifier is for unsigned long
integers in a field width of 12, so that they line up.
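

An output function of this kind can be sketched along the following lines. It assumes only what the description above states: a newline whenever the loop index is an exact multiple of ROW_SIZE, and an unsigned long format with a field width of 12. The name PutBufferTo() and its extra stream parameter are hypothetical, added so that the output can be sent to any open file as well as to the screen.

```c
#include <assert.h>
#include <stdio.h>

#define ROW_SIZE 5     /* Values per output line */

/* Write Count values from pBuffer to pOut, ROW_SIZE to a line.
   Returns the number of lines started. */
int PutBufferTo(FILE *pOut, const unsigned long *pBuffer, int Count)
{
   int i;
   int Rows = 0;

   assert(pOut != NULL && pBuffer != NULL);
   for (i = 0; i < Count; i++)
   {
      if (i % ROW_SIZE == 0)             /* Index divides exactly, so new line */
      {
         fputc('\n', pOut);
         Rows++;
      }
      fprintf(pOut, "%12lu", pBuffer[i]); /* Field width of 12 lines them up */
   }
   return Rows;
}
```

A call such as PutBufferTo(stdout, Primes, index) would then play the part of the call PutBuffer(Primes, index) that we used earlier.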


Collating The Parts 


To run the program you need to enter all the functions we've described 
into a single text file, compile and link it. Assuming that you've keyed it 
all in correctly you should be able to get as many primes as your 
computer and your patience will permit. 





Moving Around in a File 


For many applications you'll need to be able to access data in a file in a 
seemingly random order. You can always find some information stored in 
the middle of a file by reading from the beginning, and continuing in 
sequence until you find what you want. But if you have written a few 
million items to the file and you have a few thousand access operations, 
then this may take some time. 


Of course, to access data in a random sequence necessitates that you have 
some means of knowing where the data you would like to retrieve is 
actually stored in the file. Arranging this is a complicated topic, and there 
are many different ways of constructing pointers or indexes to make direct 
access to file data faster and easier. The basic idea is similar to that of an 
index in a book, where you have a table of keys that identify the contents 
of each record in the file that you might want, and where each key has an 
associated position in the file defined where the data is stored. 


We will only cover the basic tools in the library necessary to enable you 
to understand file input/output, and leave further research as a follow-on 
project for you once you get to the end of the book, and achieve the 
status of an accomplished C programmer. 
File Positioning Operations


There are two aspects to file positioning, finding out where you are at a 
given point in a file, and moving to a given point in a file. The former is 
a pre-requisite for the latter. If you never know where you are, you can 
never decide where you want to go. 


Accessing a random point in a file can be done regardless of whether the 
file concerned was written in binary or text mode. However, working with 
text mode files gets rather complicated if the system you're using records 
a newline as two characters. This results from the fact that the number of 
characters recorded in the file can effectively be greater than the number 
you actually wrote. 


The problem arises when you think that a point in the file is 100 bytes 
from the beginning. If you subsequently write different data which is the 
same length in memory, it will only be the same length in the file if it 
contains the same number of '\n' characters. For this reason we shall
sidestep the complications of moving about in text files and concentrate on 
the much more useful - and easier - context of binary files. 


Finding Out Where You Are 


We have two functions to tell us where we are, which are both very 
similar in what they do, but not identical. They each complement a 
different positioning function. The first is ftell() which has the following
prototype: 


long ftell(FILE *pFile); 


This function accepts as an argument a file pointer, and returns a long 
integer value specifying the current position in the file. This would be 
used with the file referenced by a pointer such as pFile which we've 
used previously, as in the statement:


fPos=ftell(pFile);


The long variable fPos now holds the current position in the file, and as
we shall see, we can use this in a function call to return to this position 
at any subsequent time. For a binary file the value is actually the offset in 
bytes from the beginning of the file. 
The second function providing information on the current file position is a
little more complicated. The prototype of the function is: 


int fgetpos( FILE *pFile, fpos_t *pPos); 


The first parameter is our old friend the file pointer, while the second is a 
pointer to a type pre-defined in STDIO.H, with the type name fpos_t. You
can look at how the type name is defined in STDIO.H if you want to 
know exactly what it is, but you really don't need to worry about it. This 
function is designed to be used with the positioning function fsetpos() 
which we will come to very shortly. The function fgetpos() will store the 
current position in the file in *pPos. It returns zero if the operation is 
successful, and a non-zero integer value if otherwise. 


Setting a Position in a File 


As a complement to ftell() we have the function fseek() with the
prototype: 


int fseek(FILE *pFile, long OffSet, int RefPt); 


The first parameter is a pointer to the file we're repositioning. The second 
and third parameters define where we want to go to, with the second 
being an offset from a reference point specified by the third parameter. 
The reference point can be one of three values which are specified by the 
pre-defined names SEEK_SET, which defines the beginning of the file,
SEEK_CUR, which defines the current position in the file, and SEEK_END
which, as you might guess, defines the end of the file. For a text mode
file, the second argument must be a value returned by ftell() if you are
to avoid getting lost. The third argument for text mode files must be
SEEK_SET.


Thus for text mode files, all operations with fseek() are performed with 
reference to the beginning of the file. For binary files you can do what 
you like, as long as you know what you are doing - or even if you don't 
if you like living dangerously. The offset argument in binary files is 
simply a relative byte count. Thus you can supply positive or negative 
values for the offset when the reference point is specified as SEEK_CUR.
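

As a sketch of how ftell() and fseek() work together on a binary file, the following pair of functions records where each value is written and later returns to any one of them. The function names WriteWithPositions() and ReadAt() are hypothetical, and the file is assumed to have been opened for update in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* Write Count values to pFile, recording in pPositions where each starts.
   Returns 1 on success, 0 otherwise. */
int WriteWithPositions(FILE *pFile, const long *pValues,
                       long *pPositions, int Count)
{
   int i;

   assert(pFile != NULL);
   for (i = 0; i < Count; i++)
   {
      pPositions[i] = ftell(pFile);      /* Offset in bytes from the start */
      if (pPositions[i] == -1L)
         return 0;
      if (fwrite(&pValues[i], sizeof(long), 1, pFile) != 1)
         return 0;
   }
   return 1;
}

/* Go to a position recorded earlier and read one value from there.
   Returns 1 on success, 0 otherwise. */
int ReadAt(FILE *pFile, long Position, long *pValue)
{
   assert(pFile != NULL);
   if (fseek(pFile, Position, SEEK_SET) != 0)
      return 0;
   return fread(pValue, sizeof(long), 1, pFile) == 1;
}
```

Because ReadAt() always calls fseek() before reading, it is safe to use it straight after a write on the same file.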
To go with fgetpos(), as we said, we have fsetpos(). This has the
prototype:


int fsetpos(FILE *pFile, const fpos_t *pPos);


The first parameter is a pointer to the file set up with fopen(), and the 
second is a file position pointer of the same type used in fgetpos(). You 
can't go far wrong with this one really. As with fgetpos(), a non-zero 
value is returned on error. 


The verb seek is used to refer to operations of moving the read/write 
heads of a disk drive directly to a specific position in the file. This is how 
the function fseek() gets its name. With a file that you have opened for 
update, by specifying the mode as “rb+” or “wb+” for example, either a 
read or a write may be safely carried out on the file after executing either 
of the file positioning functions, fsetpos() or fseek(), regardless of what 
the previous operation on the file was. 
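

The pairing of fgetpos() and fsetpos() can be illustrated with a short sketch that looks at the next character in a file without moving on. The function name PeekChar() is hypothetical, and the sketch assumes a file that is open for reading.

```c
#include <assert.h>
#include <stdio.h>

/* Return the next character in pFile without consuming it.
   Returns EOF at the end of the file or if positioning fails. */
int PeekChar(FILE *pFile)
{
   fpos_t Pos;
   int ch;

   assert(pFile != NULL);
   if (fgetpos(pFile, &Pos) != 0)      /* Remember where we are */
      return EOF;
   ch = fgetc(pFile);                  /* Read one character */
   if (fsetpos(pFile, &Pos) != 0)      /* Go back to where we were */
      return EOF;
   return ch;
}
```

Calling PeekChar() twice in succession returns the same character both times, since the position saved in Pos is restored after each read.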


A Random File Access Example 


To exercise our new found file handling skills we could write a simple 
example which will write a series of names and addresses to a binary file, 
and then retrieve them in alphabetic sequence by using the names and 
associated file positions which we record in memory. 


We will be making some simplifying assumptions to keep the number of 
lines of code down. First of all we won't be worrying about errors, but 
you know by now that it's important to check for them in your programs. 
Secondly we will accept any single string for a name, an address line, or 
a phone number. In practice you would need to handle first names or 
initials as well as surnames, and you would probably want to validate the 
phone number. Of course a real application would also have much more 
data in a record, but at least we will see how randomly accessing a file 
works. 


Designing the Program 


Let's start by deciding how it will work as a whole. We can read all the 
records for people, write them to a file, and then construct an alphabetical 
index, using a linked list. The records can then be displayed in the sequence
determined by the index. First though, we will need a structure to store the 
personal data. 








typedef struct person            /* Person structure type definition */
{
   char Name[40];                /* Name of person */
   char Address[5][40];          /* Address up to five lines */
   char Phone[40];               /* Phone number as string */
}Person;


This shows how we can define a structure type, person, and the type 
Person as equivalent to struct person. This will allow us to define 


variables of type Person without using the keyword struct. Therefore the 
declaration: 


Person aPerson; 
is equivalent to the statement: 
struct person aPerson; 


We can also define a structure and a type for members of the linked list 
containing the index to the file in a similar way: 





typedef struct index             /* Linked list element for the index */
{
   char Name[40];                /* Name of person */
   struct index *Next;           /* Pointer to next index entry */
   long fPos;                    /* Position in the file */
}Index;


Here we have defined the structure index, and the type Index which is 
equivalent to struct index. Note that you cannot use the type Index in the 
definition of the member Next in the structure index, because Index is 
not defined at that point. 


Let's assume that all the input processing will be done by a function 
ReadPeople(), and that function will use Insert() to construct the linked 
list which will be the index. We can also assume a function Display() to 
write the details of a Person object to the screen. Based on these 
assumptions we can write the function main(): 
We first have the necessary include statements, a definition for a symbol
FILENAME for the name of the file to be used, the structure definitions and
the function prototypes. You should change the definition of FILENAME if
necessary in your environment. The function ReadPeople() will return a
pointer to the first element in the index to the file. The function Insert()
will accept a pointer to the new index entry, and a pointer to the first
element in the list.


The function main() declares a pointer to the first element in the index,
pHead, a Person object, aPerson, and a file pointer pFile. The first action
in main() is to call the function ReadPeople(). This does almost all the 
work. Once this function has been executed, the file will have been written, 
the index will have been created and a pointer to the index is returned. The 
file is then opened in read mode as a binary file. 


The file is read in the while loop, and within the loop, the fPos member
of each entry in the index list is used to position the file using the function 
fseek(). Each Person record that is retrieved is output using our function 
Display(). When the Next pointer for an entry in the linked list is NULL, 
we have processed all the entries so the loop ends. The file is then closed 
and deleted from the disk. 
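

The retrieval loop described above can be sketched as a function that follows the index and fetches each record with fseek() and fread(). This is an illustration only: the name ReadInIndexOrder() is hypothetical, the structure definitions are repeated so that the fragment stands alone, and the records are collected in an array rather than being displayed.

```c
#include <assert.h>
#include <stdio.h>

typedef struct person
{
   char Name[40];
   char Address[5][40];
   char Phone[40];
} Person;

typedef struct index
{
   char Name[40];
   struct index *Next;
   long fPos;
} Index;

/* Follow the index from pHead, reading each Person record from pFile
   into pOut in index order. Stops after Max records, at the end of the
   list, or on the first failure. Returns the number of records read. */
int ReadInIndexOrder(FILE *pFile, const Index *pHead, Person *pOut, int Max)
{
   const Index *pIndex = pHead;
   int Count = 0;

   assert(pFile != NULL);
   while (pIndex != NULL && Count < Max)
   {
      if (fseek(pFile, pIndex->fPos, SEEK_SET) != 0)     /* Position the file */
         break;
      if (fread(&pOut[Count], sizeof(Person), 1, pFile) != 1)
         break;
      Count++;
      pIndex = pIndex->Next;                             /* Next index entry */
   }
   return Count;
}
```

The order in which the records sit in the file doesn't matter here. Only the order of the entries in the linked list decides the sequence in which they come back.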


Processing the Input 


All the input processing, including writing the file and generating the index 
is managed by the function ReadPeople(): 


/*********************************************************
 * Function to read a person's details, construct        *
 * a Person object and write it to a file. Also          *
 * construct an index to the file.                       *
 *********************************************************/
Index *ReadPeople(void)
{
   Person aPerson;               /* Structure to store a person */
   Index *pIndex=NULL;           /* Pointer to an index entry */
   Index *pHead=NULL;            /* Pointer to the first list element */
   FILE *pFile;                  /* File pointer to person file */
   int i=0;                      /* Loop counter */

   pFile=fopen(FILENAME, "wb");  /* Open or create a file in write mode */

   for(;;)                       /* Read data until a blank name */


The call to fopen() with the mode "wb" will discard the contents of an
existing file, so if you have a file of the same name you need to change the
FILENAME symbol definition if you don't want to overwrite it. All input
processing is done in the infinite for loop. Within the loop, the name of a
person is first read directly into the structure member aPerson.Name. We
leave the loop using a break statement if an empty name string is entered.



















An index entry is created on the heap, and the Next member is set to 
NULL. If the pointer pHead is still NULL then the linked list for the index 
must still be empty, so we assign the address of the index entry which is 
stored in pIndex, to pHead, so that it becomes the head of the list. If the 
list isn't empty, we insert the new element pointed to by pIndex in the list 
by calling our function Insert(). Because each element will be inserted in 
alphabetical sequence, the new element could be added at the head of the 
list, so the function Insert() returns a pointer to the head of the list once 
the new entry has been inserted. 


The function ReadPeople() then reads up to five lines of address, and the 
phone number, each being stored in the Person object. Having read all the 
data for a person, we are ready to write it to the file, but first we need to 
obtain the current file position by calling the library function ftell(), and
storing it in the index member fPos. After writing the Person object to the
file, we repeat the process for the next person. 


When all data has been read, the file is closed, and the pointer pHead is 
returned to main(). 


Inserting into the Linked List 


The function Insert(), places a new structure of type index into the linked 
list in alphabetical sequence. The code for the function is: 


/*******************************************************
 * Function to insert a new element in a list          *
 * in alphabetical sequence                            *
 *******************************************************/
Index *Insert( Index *pIndex, Index *pHead )
{
   Index *pNext;                 /* Pointer to next list element */

   /* Check to see if the new one should go at the front */
   if (strcmp(pIndex->Name, pHead->Name)<=0)
   {
      pIndex->Next=pHead;        /* Set the current head as next */
      return pIndex;             /* Return the new head */
   }
   pNext=pHead;                  /* Start pNext at the beginning */
   while(pNext->Next!=NULL)      /* Stop if the next is NULL */
   {  /* Check for insertion in front of the next one */
      if (strcmp(pIndex->Name, pNext->Next->Name)<=0)


Inserting a new index entry at the beginning is quite straightforward. We
just set the Next member of the new entry pointed to by pIndex, to point 
to the existing head of the list, pHead, and return the address of the new 
entry as the new head of the linked list. 


Inserting the new entry in the middle might seem a bit confusing, but the 
diagram below should help to make clear how it is done. 


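[Diagram: the list elements pointed to by pNext and pNext->Next, with the
new element pointed to by pIndex being linked in between them]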


The boxes in the diagram are members of the linked list, and they are
labeled with the pointers that contain their respective addresses. If the
Name[] member of pIndex is greater than that of pNext, which is the
current list element, and less than or equal to that of the next element,
which is pointed to by the Next member of pNext, then pIndex is inserted
between the two. The Next pointer member of pIndex is set to the address
of the list element following pNext. This address is contained in
pNext->Next. The Next member of pNext is then reset to point to pIndex.
Finally, the original head of the list is returned.
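

The three cases, insertion at the head, in the middle, and at the end of the list, can be sketched in one self-contained function. This is a sketch rather than the listing above: the name InsertSketch() is hypothetical, the loop is arranged a little differently, and an empty list is also accepted, which the function described above leaves to its caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct index
{
   char Name[40];
   struct index *Next;
   long fPos;
} Index;

/* Insert pIndex into the list headed by pHead, keeping the list in
   alphabetical sequence. Returns the head of the list afterwards. */
Index *InsertSketch(Index *pIndex, Index *pHead)
{
   Index *pNext = NULL;

   assert(pIndex != NULL);
   if (pHead == NULL || strcmp(pIndex->Name, pHead->Name) <= 0)
   {
      pIndex->Next = pHead;          /* New entry becomes the head */
      return pIndex;
   }
   pNext = pHead;                    /* Start at the beginning */
   while (pNext->Next != NULL &&
          strcmp(pIndex->Name, pNext->Next->Name) > 0)
      pNext = pNext->Next;           /* Move on while the next name is smaller */

   pIndex->Next = pNext->Next;       /* Following element, or NULL at the end */
   pNext->Next = pIndex;             /* Link the new entry after pNext */
   return pHead;                     /* Head is unchanged */
}
```

Inserting Jones, Adams, Smith and Kelly in that order leaves the list as Adams, Jones, Kelly, Smith, exercising all three cases.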


Display Details of a Person 


This is a very simple function to output the name, phone number and 
address members of a Person object. 





The name and phone number are displayed by the printf() function. The 
lines of the address are displayed by the puts() function that is called in 
the while loop. The loop continues as long as the line count is less than 
five, and the current line is not empty. 
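

A display function of this kind might be sketched as follows. The name DisplaySketch(), the wording of the output and the return value are assumptions made for illustration. The use of printf() for the name and phone number, and of puts() in a while loop for the address lines, follows the description above.

```c
#include <assert.h>
#include <stdio.h>

typedef struct person
{
   char Name[40];
   char Address[5][40];
   char Phone[40];
} Person;

/* Display the name, phone number and address lines of a person.
   Returns the number of address lines displayed. */
int DisplaySketch(const Person *pPerson)
{
   int i = 0;

   assert(pPerson != NULL);
   printf("\nName: %s   Phone: %s\n", pPerson->Name, pPerson->Phone);

   /* Continue while there are lines left and the current one isn't empty */
   while (i < 5 && pPerson->Address[i][0] != '\0')
      puts(pPerson->Address[i++]);
   return i;
}
```

The test on i comes first in the loop condition, so the function never looks at a sixth address line when all five are in use.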


Program Analysis 


If you want to compile and run this example, you just need to assemble the 
functions we have described, and the block of code that includes main(), 
into a single file. The program will prompt for input. When you have 
added all the personal records you want, just press return. The program 
will read back the records from the file in ascending alphabetical sequence 
and display them. 
Using Temporary Work Files


Very often you'll need a work file just for the duration of a program, 
which will only be used to store intermediate results and can be thrown 
away when the program is finished. Our program for calculating primes 
that we wrote earlier in this chapter is a good example, because we really 
only needed the file during the calculation. 


We have a choice of two functions to help with temporary file usage, and 
each has its advantages and disadvantages. 


Creating a Temporary Work File 
The first function will create a temporary file automatically. Its prototype
is: 


FILE *tmpfile(void); 


It takes no arguments and returns a pointer to the temporary file. If the 
disk is full for example, and the file can't be created then the function 
will return NULL. The file is created and opened for update (“wb+” mode), 
so that it can be both written and read, but obviously you need to do it 
in that order. The file is automatically deleted on exit from your program, 
so there's no need to worry about any messy excess files being left 
behind. You'll never know what the file is called, and since it doesn't 
continue to exist after, it doesn't really matter. 


The disadvantage of this function is that the file will be deleted as soon 
as you close it, effectively meaning that you can't close the file, having 
written it in one part of the program, and then reopen it in another
part of the program to read the data. You must keep the file open for as 
long as you need to access the data. A practical illustration of creating a 
temporary file is provided by the statements: 
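

A minimal sketch of this pattern, using a hypothetical helper named RoundTrip(), writes a value to a temporary file, goes back to the beginning, and reads the value back before the file is closed and disappears:

```c
#include <assert.h>
#include <stdio.h>

/* Write Value to a temporary file and read it back again.
   Returns the value read, or -1L if anything goes wrong. */
long RoundTrip(long Value)
{
   long Result = -1L;
   FILE *pTemp = NULL;

   assert(Value >= 0L);                /* -1L is reserved to report failure */
   pTemp = tmpfile();                  /* Create and open for update */
   if (pTemp == NULL)
      return -1L;
   if (fwrite(&Value, sizeof(long), 1, pTemp) == 1)
   {
      rewind(pTemp);                   /* Back to the start before reading */
      if (fread(&Result, sizeof(long), 1, pTemp) != 1)
         Result = -1L;
   }
   fclose(pTemp);                      /* The file is deleted at this point */
   return Result;
}
```

Note that the file stays open between the write and the read. Closing it in between would throw the data away.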










Creating a Unique File Name 


The second possibility is to use a function that provides you with a 
unique file name. Whether this ends up as a temporary file or not is up 
to you. The prototype for this function is: 


char *tmpnam(char *pFileName);


If the argument to the function is NULL, then the file name is generated in 
an internal static array of type char, and a pointer to it is returned. 


If you want the name stored in a char array that you declare yourself, 
you must pass a pointer to the array as an argument to the function. Your 
array must be at least L_tmpnam characters long, where L_tmpnam is a
pre-defined constant in STDIO.H. In this case, the file name is stored in the
array that you specify as an argument. A pointer to your array is also 
returned. 


A Unique File Creation Example 


So if we take the first possibility then we can create a unique file with 
the statements: 


FILE *pFile=NULL;                              /* Declare a file pointer */
char *pFileName=NULL;                          /* Pointer to a name */
pFile=fopen(pFileName=tmpnam(NULL), "wb+");    /* Create the file */


Here we've declared our file pointer pFile, and our pointer, pFileName, 
which will contain the address of the temporary file name. We have 
combined the call to tmpnam() with the call to open the file by putting 
the assignment as the first argument to fopen(). Because the argument to
tmpnam() is NULL, the file name will be generated as an internal static 
object whose address will be placed in our pointer pFileName. 


Don't be tempted to write: 


pFile=fopen(tmpnam(NULL), "wb+");






If you do then you'll no longer have access to the file name, so you won't 
be able to use remove() to delete the file. 


If you want to create the array to hold the file name yourself, you could 
write: 





Remember that the assistance we've obtained from the library function
tmpnam() is just to provide a unique name. It's your responsibility to
delete any files created, and you should also note that you'll be limited to
a maximum of TMP_MAX unique names from this function in your program.
The symbol TMP_MAX is defined in STDIO.H. Its value depends on the
implementation, and a typical value such as 65535 is more file names than
most people will ever need in one program.
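

When you supply the array for the name yourself, the whole life of a uniquely named work file might be sketched as below. The function name UseWorkFile() and the data written are hypothetical. The points to note are that the caller's array must be at least L_tmpnam characters long, and that removing the file afterwards is up to us.

```c
#include <assert.h>
#include <stdio.h>

/* Create a uniquely named work file, write to it, then delete it.
   The name is left in pName, which must hold at least L_tmpnam
   characters. Returns 1 on success, 0 otherwise. */
int UseWorkFile(char *pName)
{
   FILE *pFile = NULL;
   int Ok = 0;

   assert(pName != NULL);
   if (tmpnam(pName) == NULL)          /* Name is generated in our own array */
      return 0;
   pFile = fopen(pName, "wb+");        /* Create the file */
   if (pFile == NULL)
      return 0;
   Ok = (fputs("work data", pFile) >= 0);
   fclose(pFile);
   if (remove(pName) != 0)             /* Deleting the file is our job */
      Ok = 0;
   return Ok;
}
```

A caller would declare the array as char FileName[L_tmpnam]; and pass it in, keeping the name available for as long as the file needs to exist.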


File Error Functions 


In addition to the error return values that we've seen for many of the file 
input and output functions, there are four functions provided in the 
standard library for detecting when error flags are set, and for resetting 
them. 


The feof Function 


The first of these, the function feof(), has the prototype:


int feof(FILE *pFile);


This function returns a non-zero value if the end of file indicator for the file
pointed to by pFile is set. The indicator is set when a read operation has
tried to read beyond the end of the file. The function returns 0 if the end of
file indicator isn't set.


The ferror Function 


To detect if an error has occurred with a previous file operation, you can 
use the function ferror() with the prototype: 
int ferror(FILE *pFile);


This function returns a non-zero value if an error has occurred, and 0
otherwise.


Error Numbers 


An integer value, errno, which is defined in ERRNO.H, provides a number 
by which to identify a particular error. A message corresponding to a 
particular value for errno can be retrieved by using the function 
strerror() which is declared in the header file STRING.H. This function has
the prototype: 


char *strerror(int ErrorNumber); 


The function returns a pointer to a string containing an implementation- 
dependent error message corresponding to the error number passed as an 
argument. You can also pass errno directly to the function if you wish. 


Printing an Error Message 


If you just want to output an error message, then the standard library 
provides the function perror(), which has the prototype: 


void perror(const char *pMyMessage); 


This will output to stderr the string pointed to by pMyMessage, followed by
a colon, a space, and the implementation-dependent error message that
corresponds to the current value of errno. So you could write:


char *pMessage="Abandon hope. ";
if(ferror(pFile))
   perror(pMessage);


to output “Abandon hope.” followed by whatever error message corresponds 
to the current file error. 
The clearerr Function


Finally, to clear any outstanding error indicators you can use the library 
function clearerr() which has the prototype: 


void clearerr(FILE *pFile); 


This will clear the error indicator set for the file, as well as the end of file
indicator. Executing a rewind() operation for a file will also clear the
error indicator.
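

The way these functions fit together can be sketched with a function that reads a file to its end and then asks why the reading stopped. The name CountBytes() and the message passed to perror() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Read pFile to the end, counting the characters read.
   Returns the count, or -1L if reading stopped because of an error.
   Both indicators for the file are cleared before returning. */
long CountBytes(FILE *pFile)
{
   long Count = 0L;
   long Result = 0L;

   assert(pFile != NULL);
   while (fgetc(pFile) != EOF)
      Count++;

   /* fgetc() returns EOF for an error as well as for end of file,
      so we need to ask which of the two it was */
   if (ferror(pFile))
   {
      perror("Read failed");
      Result = -1L;
   }
   else
      Result = Count;                  /* feof(pFile) is non-zero here */

   clearerr(pFile);                    /* Reset both indicators */
   return Result;
}
```

After the call, feof() returns zero for the file again, and it stays that way until another read runs into the end of the file.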


Summary 


In this chapter we have covered the device-independent file operations 
provided by the standard library. These functions are available to support 
file operations in any ANSI-standard compliant implementation of C. The 
major points arising in this chapter are: 
Before you can use a file, it must be opened by calling the function
fopen(). This establishes a link between a file pointer in your 
program, and the physical file on your storage device. 


The file mode is established when a file is opened. A file can be 
opened in write mode, append mode, read mode, or update mode, 
and can be a binary file or a text file. 


The most elementary file operations provided by the standard library 
enable you to read and write one character at a time using the 
functions fgetc() and fputc(). 


You can return a character that's just been read using fgetc(), back 
to the buffer. You can do this by using the function ungetc(). The 
character EOF cannot be returned. 





Files other than those associated with the keyboard and screen are 
automatically buffered in memory. 


Formatted read and write operations are provided by the functions 
fscanf() and fprintf(). 


Binary read and write operations are provided by the fread() and 
fwrite() functions. 


You can randomly access records in a file. A file position is recorded 
by the function ftell() relative to the beginning of the file and can 
be recovered by the function fseek(). 


You can also position a file using the function fseek() by specifying an 
offset relative to the current position, or to the end of the file. You can also 
use the function fgetpos() to record a file position and subsequently 
restore it with the function fsetpos(). 


The current position, the end of a file and the beginning of a file are all 
defined by the standard symbols SEEK_CUR, SEEK_END, and SEEK_SET.


You will find there are alternatives to the standard library functions for 
input and output in some environments. This is most notable under the 
UNIX operating system where access to input and output operations (and to 
other services, including those provided by the standard library functions) 
can be gained through system calls. You will often be able to gain some 
improvement in performance in your programs by using UNIX system calls 
directly, but at the expense of limiting your program's portability.
Programming Exercises


1 Write a program to store proverbs or sayings in a file, and then
retrieve all those containing a given word or sequence of words. 


2 Extend the program from Exercise 1, providing add and delete 
operations too. 


3 Write a program to scan a C program file to count the number of 
times a set of given keywords has been used. 


4 Write a program to scan a C program identifying symbols in #define
commands, then generate a new program file with substitutions for 
the symbols and with the #define commands deleted. 
The Pre-processor and Debugging


In this chapter we'll explore what facilities the pre-processor provides and 
how they are used. We will also be looking at how the standard facilities in 
C can help you to debug your programs. By the end of this chapter you 
will have learnt: 


How the #include command operates. 

How the #define directive is used. 

What a macro is. 

How to define and use macros with parameters. 
What logical pre-processor directives are available. 


How to use pre-processor directives to avoid accidentally duplicating 
code in your program. 


What assertions are and how you can use them to help debug your
programs. 


What the common causes of bugs in your programs are. 
The Pre-processor


The pre-processor is a program which is executed prior to the compilation 
of your C program source code, providing a means for you to manipulate, 
modify, and augment your C source code. The pre-processor is controlled 
by means of commands, or pre-processor directives, inserted in your source 
code, all of which must have a hash (#) character as the first
non-whitespace character on the line. We've already used two of these quite
extensively, the #include directive which inserts the contents of a file into 
your source code, and the #define directive which defines a string to 
replace a symbol used in your program. 


You need to keep in mind that all pre-processor operations occur before 
your program is compiled. They modify the set of statements that 
constitute your program. None of the pre-processor directives in your 
program remain after pre-processing is complete, so they're not involved 
in the execution of your program at all. 


Including Files in your Program 


You have already used this particular directive several times, but let's go 
through it from the beginning so that we're crystal clear as to what's 
happening. The #include directive will insert the specified text file into 
your program at the point where it appears. 


Including Standard Header Files 
This is most frequently used to add the contents of standard header files 
to your program, providing a whole host of declarations and definitions to 
enable you to use standard library functions. By now, you'll be completely 
familiar with statements such as: 

#include <stdio.h> 

which fetches into your program the header file supporting input and 
output functions from the standard library. This version of the #include 
directive searches for the file that has the name specified between the 
angled brackets, and inserts its contents into your program source file in 
place of the #include directive. 


Most compilers will use a specific directory to store their include files, and 
you may need to specify this in compiler commands, as an environment 
variable, or by using a dialog box if you’re using a compiler with a 
graphical development environment. 


Including Your Own Files 


There is another form of the #include directive that is used to add the 
contents of one of your own files to a program. This uses double quotes 
instead of the angled brackets. For example: 





The difference between this form and that using angled brackets lies in 
which directories are searched for the required file. This form of #include 
directive will first search the directory containing your source file, and if 
the file isn’t found in the source directory, then it behaves the same as the 
form using angled brackets. 


Including Strategies 


Although include files are frequently given names with the extension .H, 
you can call them whatever you like. You can use the #include 
mechanism for dividing your program into several files, which makes a 
large program much easier to manage. Files containing global definitions 
such as function prototypes, global variables, and symbol definitions are 
usually given names with the extension .H. Files containing function 
definitions are commonly given the extension .c. 


By putting all your symbol definitions into a single file you can make 
sure that the same symbols are used throughout the program, and the same 
applies to definitions of structures that you want to be global. With one 
definition used by all program files you ensure consistency, minimize 
errors, and have the ability to make modifications easily. Another use for 
a header file is to group together all the standard library #include 
directives, so that they can all be included into any source file by using a 
single #include directive. 


You need to avoid duplicating information when you include more than 
one file in your program, though. A file that you include in your program 
can also contain an #include directive for another file, as is illustrated 
here: 
[Figure: MAIN.C includes CODE.H and UTILS.H. Each of these header files 
includes both <stdio.h> and <math.h>, giving unnecessary duplication of files.] 


With large programs, the file structure defining the complete program can 
become quite complicated, and there’s considerable potential for 
introducing a file more than once. Both standard header files are included 
in each of the subsidiary .H files, so MAIN.C can potentially contain two 
copies of each of the standard header files. 


Duplicate code will usually cause compilation errors so it’s essential to 
prevent this from happening. We shall see later in this chapter how the 
pre-processor provides some facilities for ensuring that any given block of 
code will appear only once in your program, even if you inadvertently 
include it several times. 


Substitutions in your Program 


Pre-processor directives that make substitutions in your source code are 
called macros. The simplest kind of substitution or macro that you can 
specify is one that we’ve already seen, that of defining a string to replace 
a symbol. For example, the pre-processor directive to substitute the actual 
numeric value for the character string PI, is as follows: 

#define PI 3.14159265 

Other than PI actually looking like a variable, it has nothing whatsoever 
to do with variables. Here the identifier PI is a token used to represent 
the sequence of digits appearing in its definition. The token, PI, is used as 
a marker for where the sequence of digits is to appear in your program, 
and will be exchanged for the specified sequence of digits. 


The pre-processor searches your program for all occurrences of PI, and 
replaces each with the text 3.14159265. When your program is ready to 
be compiled after pre-processing has been completed, the symbol PI will 
no longer appear, having been replaced in every case by the numeric 
value. As you've already seen, this has the advantage that you can modify 
the value for all occurrences of PI in a program, just by altering its 
definition in the #define directive. 


Range Restrictions 


The pre-processor, however, won't search strings between quotes. If you 
have a statement such as: 


then the PI appearing in the string between double quotes won't be 
changed by the pre-processor, so the statement will become: 


It will also fail to change strings embedded in identifiers or keywords, so a 
variable with the name PINnumber wouldn't be affected by our definition 
for PI. 


The const Alternative 


Although you can use the #define directive for numeric constants such as 
PI, it’s preferable to use const variables to do this. You can achieve the 
same result with a global statement such as: 





const double PI = 3.14159265; 


This leaves no doubt as to the type of constant being used, reducing the 
possibility of the value being misinterpreted by the compiler. 
The #define Directive 


The general form of the #define pre-processor directive is: 

#define identifier sequence of characters 



Here the identifier conforms to the usual definition of a C identifier 
that we discussed back in Chapter 2. The sequence of characters is 
optional, and is used to define a token which has a value, as opposed to 
just defining the existence of a token. 


It’s a convention amongst C programmers that identifiers created using 
#define are given names in capitals, so that they stand out. This is 
especially useful where identifiers look like functions, which we'll look at a 
little later on. 


Using #define 


A very common use of the #define directive which again we've already 
seen, is to define array dimensions. When this is done, only one directive 
needs to be modified, both to alter the dimensions of the array, and the 
behavior of statements, such as loops, which use the array dimension as a 
control. 


Nesting Substitutions 
Substitutions can also use symbols that have been defined by other 
substitutions. In Chapter 8 we wrote a program to produce an arbitrary 
number of primes. We defined ROW_SIZE for the number of primes to 
appear in an output line, and MEM_PRIMES as the dimension of the array 
to hold primes in memory. We said at the time that ideally MEM_PRIMES 
should be a multiple of ROW_SIZE to ensure a tidy output. We could have 
enforced this with the directives: 





Now MEM_PRIMES will definitely be a multiple of ROW_SIZE. The 
parentheses in the substitution string (20*ROW_SIZE) are a safety 
precaution against unexpected results when the symbol is used within a 
larger expression. 
Disregarding Context 


The pre-processor is making string substitutions without regard for context, 
and the context can sometimes produce something different from what you 
want. This is usually caused by operator precedence. 


This is easier to see if we use a slightly different example. Suppose we 
define VALUE with the directive: 


#define VALUE 10+15 


Later in the program we will have an expression, 2*VALUE which we 
would expect to give the result 50. In fact, it produces 35 because the 
expression will end up as 2*10+15, which when you see it is clearly 35. If 
we insert parentheses around the substitution string: 


#define VALUE (10+15) 


we would obtain the result we wanted. It's a good idea to 
always use parentheses around any substitution string other than a 
constant. 


Macro Substitutions 


We can define a macro using the #define directive, which will accept 
arguments rather like a function. This allows different arguments to be 
specified in various instances of using the macro, and these arguments 
will replace the corresponding tokens in the macro's substitution string. 


A Simple Example 


We will be able to better understand this by looking at an example: 

#define PRINT(INTVALUE) printf("%d", INTVALUE) 

This directive provides for two levels of substitution. There is the 
substitution for PRINT(INTVALUE) by the string immediately following it in 
the #define statement, and there's also the substitution of alternatives for 
the parameter INTVALUE. For example, you could write the statement: 


This will be converted by the pre-processor to: 





You could use this directive to create a printf() statement for any 
variable or constant of type int at various points in your program. 


Macros with Multiple Arguments 


The most general form of the #define directive that accepts an argument 
can be represented as: 

#define identifier(list of identifiers) substitution string 

The list of identifiers represents one or more parameters separated 
by commas. This shows that in the general case multiple parameters are 
permitted, so we're able to define more complex substitutions. 


Note that you mustn't leave a space between the first 
identifier and the left parenthesis, because the identifier is 
terminated by the first space. 





An Example 
To illustrate how you use this sort of definition, we can define a macro 
for producing the maximum of two values with: 

#define MAX(x, y) x>y?x:y 


This has two parameters x and y, which will be replaced in the 
substitution string for the macro by whatever arguments are specified 
when it's used. Therefore, we could use the macro with the statement: 





This will be expanded by the pre-processor into the statement: 





This will calculate the maximum of the two values specified as arguments 
to the macro, as we expect. This is a very useful macro, as it’s not type- 
dependent. As long as the arguments are of the same type, this will work 
with any type of arguments. To implement this as a function we would 
need a separate function with a different function name for each type of 
argument we wanted to handle. 


Pitfalls with Macros 


There is a trap hidden in the last macro. It’s important to be conscious of 
the substitution that’s taking place, and not to assume that this is a 
function. Otherwise you can get some really strange results, particularly if 
your substitution identifiers include an explicit or implicit assignment. For 
example, the following modest extension of our last example can produce 
an erroneous result: 


Result=MAX(MyValue++, 99); 

The substitution process will generate the statement: 

Result=MyValue++>99?MyValue++:99; 


What happens as a result of this statement now depends on whether the 
variable MyValue is greater than 99 or not. 


If the value of MyValue is less than or equal to 99, then the variable 
Result will be assigned the value 99, and MyValue will be incremented. If 
MyValue is greater than 99, then MyValue will be incremented by the 
comparison, Result will be assigned this incremented value, and MyValue 
will then be incremented a second time. There is no way to protect against 
this other than to always use capital letters for your macro names, to 
provide a visual clue that they aren’t real functions. 








You must just remember not to use the increment or 
decrement operators, or any other expressions that modify 
variables, as arguments to a macro which may cause the 
expression to be evaluated more than once. 











Chapter 9 - The Pre-processor and Debugging 





Precedence Rules 


You need to be aware that precedence rules can also catch you out with 
macros accepting arguments, in much the same way as they did with 
simple substitutions. We can demonstrate how this can occur with an 
example. Suppose we write a macro to calculate the product of two 
arguments: 





We then use this macro with the statement: 





Of course everything works fine but we don't get the result we want, since 
the macro expands to: 





It could take a long time to discover that we aren't getting the product of 
the two parameters, as there's no external indication of what's going on. 
There is just a more or less erroneous value propagating through our 
program. 


Using Parentheses 
The solution is very simple. If you use macros to generate expressions, put 
parentheses around everything, especially the individual parameters. For our 
example to work properly every time we need to rewrite our example as: 





The inclusion of the outer parentheses here may seem excessive, but since 
you don't know the context in which the macro expansion will be placed, 
it's always better to include them, and it doesn't cost anything. Our 
previous macro for the maximum of two arguments would be much better 
written as: 

#define MAX(x, y) ((x)>(y)?(x):(y)) 



Now a statement such as: 





will work as we want, whereas it wouldn't have worked with the previous 
version. 


Verbose Macro Expansion 


You also need to be aware that expansion of macros can result in large 
amounts of ugly code, which can be horrible to debug. For instance, using 
the definition of MAX(x,y) from a couple of lines ago, the expression: 

Result=MAX(MAX(1,2), MAX(3,4)); 

would expand into the rather horrendous: 

Result=((((1)>(2)?(1):(2)))>(((3)>(4)?(3):(4)))?(((1)>(2)?(1):(2))):(((3)>(4)?(3):(4)))); 


So be very careful! 


Macros as Shorthand 


If you have an expression of some complexity that you use quite frequently, 
you can sometimes use a macro in order to reduce the amount of typing. It 
can also simplify your code and make it more readable in some instances. 
For example, as we saw in Chapter 8, the standard library function fgets() 
has the merit that it will read a string from an input stream, but the input 
is limited to the number of characters specified by its second argument. You 
will recall that it has the prototype: 





char *fgets(char *pStr, int nChars, FILE *pFile ); 

This function is often used to read a string from stdin, because it can 
ensure that the number of characters read doesn't exceed the length of the 
array receiving the input. It returns NULL if an end-of-file or a read error 
occurs, and pStr otherwise. It can be more conveniently used in a macro such as: 

#define GetLine(pStr,N) ((fgets(pStr,N,stdin)==NULL)?EOF:(int)strlen(pStr)) 

The macro GetLine() will now read a maximum of N-1 characters from 
stdin into the array pointed to by pStr. It will also generate a value which 
is either the length of the string if the operation is completed without error, 
or EOF if an end-of-file condition is recognized. 

You may find other circumstances where you can usefully package other 
standard library functions inside a macro. 


Strings as Macro Arguments 


String constants are a potential source of confusion when used with 
macros, so let's start with the most elementary case, and work our way 
up. The simplest string substitution is a single level definition such as: 





If you now write the statement: 





then this will be converted by the pre-processor into the statement: 





which should be what you're expecting. You couldn't use the #define 
statement without the quotes in the substitution sequence, and expect to be 
able to put the quotes in your program text instead. For example, if you 
write: 





there will be no substitution for MYSTRING in the printf() function. 
Anything in quotes in your program is assumed to be a literal string, and 
as we saw early on in this chapter, the pre-processor won't modify it. 


Using Double Quotes 


A special technique is provided for indicating that the substitution for a 
macro argument is to appear between a pair of double quotes to form a 
string. For example, you could specify a macro to display a string using 
the function printf() as: 
The # character preceding the appearance of the parameter, STRING, 
indicates that the argument is to be surrounded by double quotes when 
the substitution is generated. Therefore if you write the statement in your 
program: 





this will be converted by the pre-processor to: 





This mechanism ensures that the substitution of the argument always 
results in a string between quotes. It also provides the possibility of 
substituting the argument with and without quotes. For example, if we 
wanted to display the value of a variable, and show its name in the 
output, we could write the macro: 





ntValue, IntValue); 


Now if you use the statement: 





the pre-processor will convert this into the statement: 





which will output the name and value of the variable nData. 


Using # in a macro substitution string also enables you to generate a 
substitution producing a string that includes double quotes. If you write 
the statement: 





it will be pre-processed into the statement: 





This is possible because the pre-processor is clever enough to recognize 
the need to put \" at each end in order to get a string including double 
quotes displayed correctly. 



Joining Two Results of a Macro Expansion 


The pre-processor will allow you to generate two results in a macro and 
join them together without spaces between them. To join two macro 
arguments into a single sequence of characters, you can specify the macro 
as: 





The two characters, ##, work as an operator to separate the parameters, 
and they indicate to the pre-processor that the results of the two 
substitutions are to be joined. For example, writing the statement: 





will result in: 






PASSA DORRIT OO OR OASIS 


This might be applied to synthesizing a variable name, building up a fully 
qualified path name from file and directory names, or generating a format 
control string from two or more macro parameters. 


Pre-processor Directives on Multiple Lines 


A pre-processor directive must be a single logical line, but a logical line 
can be extended to multiple physical lines by using the statement 
continuation character, \. We could write: 





Here, the backslash indicates that the directive continues on the second 
line with the first non-blank character found, so you can position the text 
on the second line wherever you feel it effects the nicest arrangement. You 
can spread a directive over as many continuation lines as you wish by 
using the backslash repeatedly. You can use the backslash with C language 
statements, too, although it generally isn’t necessary. 
Logical Pre-processor Directives 


The last example we looked at appears to be of limited value, since it’s 
hard to envisage when you would want to simply join var to 123 - you 
could always use just one parameter, and write var123 as the argument. 
One aspect of pre-processing facilities that adds considerably more 
potential to such tasks, is the possibility for multiple macro substitution, 
where the arguments for one macro are derived from substitutions defined 
in another. In our last example, both arguments to the join() macro 
could have been generated by other #define substitutions or macros. 


The pre-processor also supports directives which provide a logical if 
capability, enabling you to make decisions about which directives are 
executed, or whether a block of code is to be included as part of your 
program. This vastly expands the scope of what you can do with the pre- 
processor. 


Conditional Pre-processing 


The pre-processor supports a conditional directive #if, which tests a 
constant integer expression, and if the expression that is tested is zero, 
subsequent lines of the program up to the point where an #endif directive 
is found are skipped, and therefore not included in the program. If the 
expression is non-zero, then the following program code is included and 
processed normally. There are various ways in which the #if pre-processor 
directive can be used, so let's look at them in turn. 


Conditional Compilation 


The first version of the #if directive we shall discuss allows you to test 
whether an identifier exists as a result of having been created in a 
previous #define statement. The general form of this test is: 

#if defined identifier 

This can also be written as: 

#ifdef identifier 

which is exactly the same as the previous version. If the specified 
identifier hasn't been defined, then all the statements following the #if 
are excluded from the program, until we reach the statement: 

#endif 


If the identifier has been defined, then all the following statements will 
be included in the program. This is the same logical process that we've 
used in C programming, except that here it results in the inclusion or 
exclusion of program statements. 


Testing for Identifier Existence 


You can also test for the existence (or absence) of an identifier. The 
general form of this directive is: 

#if !defined identifier 

This can also be written in an abbreviated version: 

#ifndef identifier 


Here the statements following the #if down to the #endif will be 
included in the program, if the identifier hasn't previously been defined. 
This provides you with a general method to avoid including functions, or 
other blocks of code and directives, in your programs more than once. 


Protecting Include Files 
When you have a program consisting of multiple files, you may end up 
with several #include statements referring to the same file, and you need 
to ensure that the file is only compiled once. A common way to do this is 
to use a sequence of directives as follows: 





The first time the file containing these directives is included in a program, 
the identifier block1 won't have been defined. Therefore the block following the 
#if will be processed and block1 will then be defined. The following 
block of code down to the #endif will also be included in your program. 
For any subsequent inclusion of the same file in a program, block1 will 
already be defined. As a result, all the statements in the file down to the 
#endif won't be included. Of course, any directives or statements 
following the #endif will be processed in the normal way. 


It's a good idea to get into the habit of automatically protecting code in 
your own files in this fashion. Once you have collected a few files 
containing your own functions, you will be surprised how easy it is to 
end up duplicating blocks of code accidentally if you don't, and using the 
conditional directive to protect against this costs absolutely nothing. 


Omitting Code using #ifdef 


You can also use the #ifdef directive to temporarily comment out blocks of 
code in your program, as follows: 


/* All code here will be omitted from the program */ 
#endif 


To include the code you just need to remove the two directives. Of course, 
you could also use an identifier to control whether the code is included or 
not, in a similar way to that used to avoid code duplication. 


Note that even if part of your program is ‘turned off’ using #ifdef, the 
pre-processor still has to scan the text in the block to find the matching 
#endif, so the text must still break down into valid C tokens. You need to 
be especially careful of things like unterminated comments and strings. 


Using Multiple Tests 


You aren't limited to testing just one value with the pre-processor #if 
directive. You can use logical operators to test whether multiple identifiers 
have been defined (or not defined). For example, the statement: 





#if defined block1 && defined block2 


will evaluate to True if both block1 and block2 have previously been 
defined, and so the following code won't be included unless this is the case. 
Rescinding a Definition 


A further extension of the flexibility in applying the pre-processor 
conditional directives, is the ability to undefine an identifier you've 
previously defined. This is achieved using a directive: 

#undef block1 

Now if block1 had previously been defined, after this directive it's no 
longer defined. One use of this directive is to ensure that a function is used 
for an operation rather than a macro. For example, the directive: 

#undef MAX 

would ensure that the macro we defined earlier wouldn't be used 
subsequent to this directive. Of course, if you use MAX() after this point, 
and you haven't provided a function MAX(), then your compiler will 
generate an error message. 


Testing for Specific Values 
As we said at the beginning of this section, you can also use a form of 
the #if directive to test the value of a constant expression. If the value of 
the constant expression is non-zero, then the following statements down to 
the next #endif are included in the program. If the constant expression 
evaluates to zero, the following statements down to the next #endif will 
be skipped. The general form of the #if directive is: 

#if constant_expression 



This is most frequently applied to test for a specific value being assigned 
to an identifier by a previous pre-processor directive, but you can also 
compare values, or indeed any constant expression, although you mustn't 
use the sizeof operator, and no type casts are allowed. We might have 
the following sequence of statements: 





The statements between the #if and #endif statements will only be 
included in the program here if the identifier PRINTLINE has been defined 
with the value 132 in a previous #define directive. 
Creating Error Messages 


If you detect a condition during pre-processing that warrants an error 
message being generated, you can use the pre-processor directive #error to 
generate a message. For example, the directive: 


#error We are in deep trouble 


will cause a diagnostic message to be generated which will include the file 
name and the line number, and will include the string of characters 
appearing in the #error directive. 


The #error directive is usually used with the #if directive to ensure that 
some essential condition is met. An example of this is: 


#if (PRINTLINE!=80) && (PRINTLINE!=132) 
#error PRINTLINE must be set to either 80 or 132 
#endif 


This condition will generate the message if PRINTLINE hasn't been set to 
one value or the other. 


Multiple Choice Selections 


To complement the #if directives, we have the #else directive too. This 
works in exactly the same way as the else statement in that it identifies 
a group of directives or statements to be included in the program if the 
#if condition fails. For instance, the previous example would probably be 
better written as: 


#if PRINTLINE==132 

/* Code for wide printer */ 
#else 

/* Code for narrow printer */ 
#endif 


In this case, the code for a wide printer will be included if PRINTLINE has 
the value 132, otherwise the code for a narrow printer will be included. 
The elif Directive 


The pre-processor also supports a special form of the #if for multiple 
choice selections, where only one of several choices of statements for 
inclusion in the program is required. This is the #elif directive, which 
has the general form: 

#elif constant_expression 




This is equivalent to an #else followed by an #if. Here's an example: 





This provides for three possible options. If PRINTLINE is 132 then the code 
for a wide printer is included. If it isn't, then if PRINTER has been defined 
as HPLASER then the code for a laser printer is included, otherwise code 
for a narrow printer is used. 


For multiple choices you can use several successive #elifs to select one 
choice from a wide range of possibilities. 


Standard Pre-processor Macros 


There are five standard macros defined by the pre-processor which you 
can use in your source program statements or in pre-processor directives. 
They each provide a specific item of information. 


Obtaining Date and Time Information 


The macro __DATE__ will be replaced by a string representation of the 
current date when it's processed in your program. This will be in the 
form "Mmm dd yyyy". Here Mmm is the month in characters, such as Jan, 
Feb, and so on. The pair of characters dd is the day in the month with 
values from 1 to 31. Single digit days are preceded by a blank. Finally 
yyyy is the year as four digits, 1995 for example. 
A similar macro, __TIME__, provides a string containing the value of the 
time when it's processed, in the form "hh:mm:ss". The string contains 
pairs of digits for hours, minutes and seconds, separated by colons. 


You could use this to record when your program was last compiled with 
a statement such as: 





Once the program has been compiled, the values output by the printf () 
statement are fixed until you compile it again. Therefore on subsequent 
executions of the program the time and date output will fall further and 
further behind the actual time and date. 

















Don't confuse these macros with the time and date functions 
which we saw in Chapter 7. The standard library functions 
will give you the correct time and date each time the program 
is executed. The pre-processor macros provide the correct value 
when the program is compiled, and these values will be used 
each time the program is executed, until it is recompiled at 
some point. 





Accessing the Filename 


You can obtain the name of the source file being compiled as a string literal 
by using the macro __FILE__. This can be useful if you're working with a 
large program consisting of many files, some of which may exist in different 
versions at any one time. Recording the name of the file being compiled can 
help keep track of which versions of files are being used. 


You can also use the __FILE__ macro for error messages in your program. 
This is particularly helpful when your program consists of more than one 
file because the error message can pinpoint the file in which the error 
originated. 


Accessing Line Numbers 


The __LINE__ macro generates the current line number. This can relate 
errors which may arise when the program is executed to a particular line of 
source code. Where the same error can occur at different points in a 
program, you could provide tracking information with statements such as: 
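
A minimal sketch of this kind of tracking is shown here. The function safe_divide() and the text of the message are hypothetical, chosen only to give the two fprintf() statements somewhere to live:

```c
#include <stdio.h>

/* Hypothetical helper: returns 0 on success, or -1 if the divisor is zero */
int safe_divide(int a, int b, int *result)
{
    if (b == 0)
    {
        /* First where the error was detected, then what it was */
        fprintf(stderr, "%s, line %d: ", __FILE__, __LINE__);
        fprintf(stderr, "attempt to divide by zero\n");
        return -1;
    }
    *result = a / b;
    return 0;
}
```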
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These statements will output the file name and the line number in the 
source file when an error occurs, followed by the error message 
corresponding to the error. With this information you know exactly where 
the problem was detected. 


Verifying ANSI Standard C 


The last macro is __STDC__, which will be defined as 1 if the compiler is an
ANSI-standard compiler. As well as ensuring that you use the correct
compiler for ANSI-standard code, it also offers protection against incorrect
options being selected in compilers which may support several different
definitions of the C language. To make sure that the ANSI option is set, you
just need to put a suitable directive in your program:
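
A minimal sketch of such a check, assuming that stopping the compilation with #error is the reaction you want. The message text and the helper ansi_level() are illustrative assumptions:

```c
/* Stop the compilation unless the compiler claims ANSI conformance */
#if !defined(__STDC__) || __STDC__ != 1
#error This program must be compiled as ANSI-standard C
#endif

/* Hypothetical helper that reports the value the compiler gave __STDC__ */
int ansi_level(void)
{
    return __STDC__;
}
```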





Debugging Your Programs 


You are about as likely to produce a bug-free version of a realistic program
first time out as you are to find hairs on a frog, so getting the bugs out
of your program is going to occupy a great deal of your time. Testing and
debugging a program normally takes rather longer than writing the code in
the first place.














Many programmers who over-estimate their ability to write
error-free applications find that the time spent debugging
grows out of all proportion to the time spent designing the
program. It is extremely important to spend a little extra time
at the design stage ironing out potential problems, because if
they persist through to the later stages of development then
debugging can become very difficult indeed.
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Debugging is a big topic. It is most important that you are familiar with the 
debug tools provided with your compiler, since they will provide you with 
the most powerful means of finding and eliminating bugs. The standard 
library and the pre-processor do provide some tools which are helpful in 
this context, which we will look at now. 


The assert Macro 


The assert macro enables you to generate diagnostic messages when errors
occur, and it's defined in the header file ASSERT.H. The macro is invoked
by a statement such as:
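
The general form is assert(expression);. A minimal sketch of it in use follows. The function half_of_even() is a hypothetical example, not taken from the text:

```c
#include <assert.h>

/* Hypothetical function that is only meaningful for even arguments */
int half_of_even(int n)
{
    assert(n % 2 == 0);    /* we assert that n is even */
    return n / 2;
}
```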





Statements of this kind are called assertions, because you assert that the 
value of expression is True. The argument, expression, must result in a 
value of type int. If the result of evaluating expression is zero (representing 
False), then a message will be output to stderr with the form: 


Assertion failed: expression, file __FILE__, line __LINE__


The argument expression is output as it appears in the original assertion, 
not as its value, which you know must be zero if you see the message. The 
output therefore provides you with a record of what the expression was 
that indicated there was an error, the source file that contains the code 
where the error was detected, and the line number within the source file. 
After outputting the message, the standard library function abort() is
called to terminate the program abnormally. What happens next is
implementation-dependent: some PC compilers write the message "Abnormal
program termination" on stderr and end the program with exit code 3.


You can use any kind of expression in an assert() macro as long as the 
result is integer, but comparisons are the most common. You could check 
that an index to an array falls within the array bounds with statements 
such as: 
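
A minimal sketch of such a check, assuming a hypothetical array items with MAX_ITEMS elements:

```c
#include <assert.h>

#define MAX_ITEMS 10            /* hypothetical array dimension */

static int items[MAX_ITEMS];    /* hypothetical data, all zero initially */

/* Return element n, asserting first that n is a legal index */
int get_item(int n)
{
    assert(n >= 0 && n < MAX_ITEMS);
    return items[n];
}
```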





This will generate an error message if the index value n is outside the 
limits of the array. 
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Pointer Problems 


A common source of problems is a pointer that hasn't been set correctly.
You could check that a pointer passed to a function has a value that isn't
NULL with statements such as:
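
A minimal sketch, assuming a simple linked list. The members of Index shown here are hypothetical, since the type's real definition isn't given in this section:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list element, defined here only so the sketch compiles */
typedef struct Index
{
    int key;
    struct Index *pNext;
} Index;

/* Insert a new item at the head of the list and return the new head */
Index *Insert(Index *pNew, Index *pHead)
{
    assert(pNew != NULL);    /* the new item must exist */
    pNew->pNext = pHead;
    return pNew;
}
```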





You may be wondering at this point why you need this. After all, you 
could program this using regular C statements anyway. This is true, of 
course, but a major advantage of using assert() is that you can control 
whether or not the code is included in the program as well. 


Removing Diagnostic Statements 


Once your program is fully tested, you don't want it cluttered up with all
these diagnostic statements. They make the program larger than it needs to
be, and may very well make it slower too. On the other hand, you don't want
to have to go through and laboriously delete them all, which is what you'd
need to do if you simply included output statements executed as a result of
an if in your program. Equally, if you later want to alter or extend the
program, you may want all your diagnostics back in again. To omit the
assert() macro diagnostics from your program, all you have to do is add the
directive:


#define NDEBUG


to the beginning of the source file, ahead of the #include directive for
ASSERT.H. Now none of the assert() statements will be included when your
program is compiled. Note that NDEBUG doesn't need to have a value. It is
sufficient if it appears in a #define directive without a value.


Creating your own Error Messages 


In some situations, the fact that the assert() macro terminates the program 
can be a nuisance. You may want to apply some local fix if you detect an 
error, and then let the program stagger on. The most obvious way to do
this is to use pre-processor directives to control whether or not the code is
included. For example, instead of assert() in the last example, we could
have written:


Index *Insert(Index *pNew, Index *pHead)
{
#ifdef DEBUG
   if (pNew == NULL)
   {
      /* code to display a message and fix up the situation */
   }
#endif
   /* Code to insert the new item */
}


Here the fix might be to do nothing after displaying an error message
except return from the function. The diagnostic code will only be
included in the program if there's a definition directive for the identifier
DEBUG, such as:


#define DEBUG


There is nothing to stop you having several different groups of diagnostic
statements, each controlled by its own identifier. You could then switch
each group on independently.


Defining Your Own Assert() Macro 


You could also define your own macro for assertions - perhaps Assert()
with a capital A - that just displays the message without exiting from the
program:


#ifdef DEBUG
#define Assert(exp) \
   ((exp) ? (void)0 : (void)fprintf(stderr, "Assertion failed: %s, file %s, line %d\n",\
   #exp, __FILE__, __LINE__))
#else
#define Assert(exp)    /* Define as empty */
#endif


We have had to use two continuation lines for the definition of the
Assert() macro here because it's rather long. If DEBUG isn't defined,
Assert(exp) is defined as empty, so it generates no code at all. If DEBUG is
defined then this macro defines Assert(exp) as a statement that does nothing
if exp is non-zero (True). If exp is zero then a call to fprintf() is
generated to output the asserting expression, the file name and the current
source line number to stderr.


Common Causes of Errors 


Most errors are due to incorrectly keying in a program and will be picked
up by the compiler, as they almost always introduce inconsistencies into the
code. Mistakes are more of a problem when they are occasional rather than
habitual, since if they're habitual you're likely to immediately identify them
as a possible source of error.
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There are a vast number of ways in which you can cause errors in your 
programs. There is a small subset though that accounts for a large 
proportion of the errors, many of which are a consequence of misusing 
pointers. Here is a list of choice candidates for execution-time bugs in your 
program. They are in no particular order: 


Omitting the required & when specifying address arguments to 
functions, particularly with scanf() or fscanf(). 


Using = in a condition where you meant ==. 


Using incorrect format specifiers in a format string, especially on 
input. 


Using a pointer that contains an address of a variable that is out of 
scope. 


Returning the address of a local variable from a function. 


Using a pointer that hasn't been initialized so it contains garbage 
values. 


Using a variable that contains garbage values in an expression. 


Calculating an expression as an integer causing rounding down, 
when you wanted a floating point result. 


Indexing an array outside of its boundaries. 


Forgetting to divide by the size of an element in an array when 
calculating the number of elements - a loop count for instance. 
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Forgetting the break statement in cases for a switch statement. 
Failing to check for error conditions with file operations. 
Failing to check for memory allocation errors. 


Omitting braces round statements to be executed when an if 
condition is True. 


Associating an else with the wrong if in an if-else sequence. 
Letting the precedence rules fool you by not using parentheses. 
Confusing rows and columns in a two-dimensional array. 


Exceeding the maximum value that can be stored in an integer 
variable. 


Writing the expression in a loop condition the wrong way round - 
for example, putting > when you mean <. 


Exceeding the capacity of a char array, particularly on input. 
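

Two of the slips listed above, forgetting to divide by the element size and letting integer division discard a fraction, can be avoided along the lines of this minimal sketch. ELEMENTS() and average() are hypothetical names chosen for illustration:

```c
#include <stddef.h>

/* Number of elements in an array: total bytes divided by bytes per element */
#define ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

/* Average of count values; returns 0.0 for an empty set */
double average(const int values[], size_t count)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < count; i++)
        sum += values[i];

    /* Cast before dividing so that the division is done in floating point */
    return count ? (double)sum / (double)count : 0.0;
}
```

Note that ELEMENTS() only works on a true array, not on a pointer that a function has received as a parameter.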


Summary 


The pre-processor provides powerful augmentation to your C programs. The 
pre-processor is a fundamental supporting mechanism, particularly where 
programs defined in several files are concerned. The important points we’ve 
covered in this chapter are: 


The pre-processor allows you to define symbolic constants that you 
can use in your program. This enables global changes such as 
modifying array dimensions to be easily managed. 


Logical pre-processor directives #if, #else, #elif, #ifdef, #ifndef
and #endif enable you to conditionally execute other pre-processor
directives.


Logical pre-processor directives enable you to control whether blocks 
of code are included or excluded from your program. You can use 
logical pre-processor directives to ensure that the contents of files 
that you include into your program can’t be included more than 
once. 
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You can use the #define directive to write macros which accept 
arguments. 


Macro parameters should be parenthesized in the definition string for 
a macro, to avoid unwanted effects due to operator precedence. 


Expressions which modify variables should be avoided in macro 
arguments because of the risk of unwanted side effects. The 
increment and decrement operators are a particular source of such 
side effects. 


The assert() macro allows you to include conditional diagnostic
output in your program. It can be omitted from your program by
defining NDEBUG.


Programming Exercises 


1 Write a macro Quad(a,b,c,x) to evaluate ax²+bx+c.


2 Write a macro to generate the absolute value of any numeric argument 
(that is, if it’s negative, make it positive). 


3 Write a macro min(a,b,c) to produce the minimum of a, b, and c. All 
arguments are of the same type. 


4 Write a macro Toupper(c) to produce the upper case equivalent of an
ASCII character c passed as an argument. The macro should do
nothing if the character c isn't a lower case letter.


5 Write a macro Bit(x,n) which will result in 1 if the nth bit of the
integer argument, x, is 1, and zero if the nth bit of x is zero.
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Portability and Maintainability 





In this chapter we're going to take a look at what aspects of programming 
in C can restrict the porting of your programs from one machine 
environment to another. We will also explore what basic things you need to 
do in order to ease the process of extending your programs, or fixing any 
problems that might arise. By the end of this chapter, you will have an 
understanding of: 


What constraints there are to C program portability. 


What you need to consider to produce reasonably portable code. 


How you should structure your programs to make maintenance 
easier. 


How to approach documenting a program. 


Which programming approaches you can use to ease program testing 
and maintenance. 
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Writing Portable Programs 


You need to put the need for portability in a proper perspective. Always 
writing programs conforming to all the rules that apply to maximize 
portability can be an unacceptable burden. In many situations it may not be 
practical, and if you don't expect your program will have to run on any 
other computer, it can involve you in a lot of unnecessary work. On the 
other hand you don't want to completely ignore it. After all, portability is 
one of the major advantages of C, and so, within reasonable limits you 
should endeavor to make your programs as portable as is practical. This 
need not be difficult or an encumbrance. It's mainly a question of adopting 
programming techniques which avoid dependencies on the machine and 
compiler you're using. 


There are essentially two aspects to making a program portable: 


Your program needs to be written to conform to a language standard
that is available across the range of computing platforms on which
you might want to run your application, in this case ANSI C.


You must also write your code to avoid any dependencies on any
one particular machine architecture, or specific hardware facility.


The Minimum ANSI Compiler 
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The ANSI standard for C provides you with a standard definition of the C 
language, and defines the minimum requirements that a conforming compiler
must meet. If your program conforms to the recommendations for
portability defined in the ANSI standard for C, then it should
compile successfully on any system with an ANSI-conforming C compiler.
Just in case you need them, here are the limits that your code must stay
within if it's to be processed by any ANSI C compiler:


6 characters in an external identifier.
8 levels of nested #if or #include directives.
12 (), [], or * in the declaration of a single identifier.
15 levels of nested control structures.
31 characters in an internal identifier.
31 nested parentheses in a single expression.
31 nested parentheses in a single declaration.
31 arguments in a macro or function call.
127 local identifiers in a single block.
127 nested expressions.
127 members in a single structure or union.
127 constants in a single enumeration.
255 case labels in a single switch statement.
509 characters in a single logical source line.
511 external identifiers in a single source file.
1024 macros in a single source file.
32,767 bytes in a single array or structure.


Most of these constraints are unlikely to trouble you. After all, how often 
do you have more than 31 levels of parentheses nesting in an expression? If 
it’s just once, there’s likely to be a readability problem with your program. 
There is one constraint you might need to take note of though: the 
minimum conforming compiler need only support 6 characters in an 
external identifier. This limit is low because of linker constraints in some 
environments, so if you happen to be using a linker that just conforms to 
the minimum necessary to meet the ANSI standard, it can be very 
inconvenient. 


Portability Constraints 


Perhaps the most serious constraint to program portability is the user 
interface. Most programs these days need to have some kind of graphical 
interface, but you may have noticed that this book is totally devoid of 
graphics programming. This is because the ANSI standard for C only 
provides the ability to interface to a user through the standard library, and 
this is limited to text. 
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Generally, since there's no supported graphics standard, implementing an 
application with a graphical user interface that will run on a UNIX 
workstation, on a PC under MS-DOS, and with Microsoft Windows is likely 
to be a very challenging task. It can be eased, however, by using one of the 
cross-platform graphical user interface tools that are available, and some of 
these have reached an acceptable level of maturity. Success will depend on
the level of graphics your application requires, and on the degree to which
you need to use advanced graphics hardware facilities, which tend to differ
from one machine to another.


If you are implementing a graphical application that needs to run on several 
different platforms, your options on strategy for program development are 
limited. Designing the program so that the user interface is as separate and 
independent as possible from the computational and data management parts 
of the program is fundamental. You must also determine at the outset the 
range of computers that it's essential for you to support, and select the 
cross-platform support tool to suit that set. 


Dependencies that arise from the way a particular processor works can be
avoided by ensuring you write your program with portability in mind. Let's
take a look at what kind of things you need to watch out for.


General Considerations 
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A general requirement for a C program to be portable is that it should 
conform to ANSI-standard C. This may seem to be self-evident but it does 
imply some significant restraint on what you can use in the typical C 
development environment. It means that you can only use the ANSI- 
standard library functions, and that all the extra goodies that come with 
even the lowest priced commercial development environment for C must be 
avoided. 


If you have library functions in C source code form, you can then
incorporate them into your program as source code, so that they're fully
integrated, removing the dependency on an external library. This presumes
that the license agreement for the library functions allows you to use them
in this way.
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Of course, if you have a program that uses some non-standard functions 
and you don’t have source code for them, you can always try to produce a 
version of your own, although this may involve a significant amount of 
work, and if the functions support specific hardware operations, this may 
not be very easy. 


Standard Header Files 


The standard ANSI-C header files contain a lot of definitions for common 
programming constants, such as values for EOF and NULL. This is to allow 
the implementation of C in different environments to provide a common 
interface to such values, and you should always, therefore, use the 
identifiers defined in the standard header files for standard constants. If you 
use EOF for end-of-file, and NULL for setting a pointer to a value that 
doesn’t point to anything, you can be sure that you'll be using values that 
are correct for any environment. 


The same applies to the data types defined in standard headers. The
operator sizeof produces a result with a value of type size_t. In most
cases using an int will work, but there will also be cases where it won't.
It's no hardship at all to use type size_t to declare variables concerned
with values returned by sizeof, and if you do so, you have the security
that it will always work as intended.


Avoiding Computer Architecture 
Dependencies 


We can look at programming for portability by considering which aspects of 
the architecture of a computer have the potential to cause problems 
depending on how your code is written. It goes without saying that if you 
write graphics code which directly addresses processor registers, or uses 
hard-coded addresses for specific operating system data, then your program 
will definitely be nailed down to whatever machine you are using. However, 
there are some hardware dependencies that can creep in without you being 
aware that it's happening, so we're going to be taking a specific look at 
these: 
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The representation and processing of characters 


The representation of numerical values 


The upper and lower limits for numerical values 


Variations in word sizes 


Avoiding Character Code Dependencies 


You need to avoid making assumptions about what the codes are that
represent particular characters, and about what the relationships between
various codes are. A rather obvious example of code that reduces portability
is:
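
A minimal sketch of the kind of test meant here, wrapped in a hypothetical function is_letter_N_by_code(). The value 78 is the ASCII code for the letter N:

```c
/* Non-portable: 78 is the code for 'N' only in ASCII-based character sets */
int is_letter_N_by_code(char ch)
{
    if (ch == 78)
        return 1;
    return 0;
}
```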





The condition in the if is dependent on the code being ASCII. This code 
wouldn't work correctly on any machine that used a code with a different 
numeric value for N. The correct approach is to use the symbol rather than 
the character code: 
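
A minimal sketch of the portable form, again wrapped in a hypothetical function, here called is_letter_N():

```c
/* Portable: the compiler supplies whatever code the machine uses for 'N' */
int is_letter_N(char ch)
{
    if (ch == 'N')
        return 1;
    return 0;
}
```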





You also need to avoid constructions that presume character codes for 
letters are assigned contiguously. An evident mechanism to be avoided is 
the incrementation of a letter to generate the next in sequence. A slightly 
less obvious construction that needs to be avoided, is the use of expressions 
to index an array with a built in assumption that letters are represented by 
contiguous codes. An example of this sort of thing is: 
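
A minimal sketch of such a construction follows. The function name and the 26-element count array are assumptions made for illustration, and only pstr is named in the text:

```c
#include <ctype.h>

/* Count how often each letter appears in pstr, ignoring case.
   Non-portable: the index expression assumes that the codes for
   'A' to 'Z' are contiguous, as they are in ASCII. */
void count_letters_contiguous(const char *pstr, int count[26])
{
    int i;

    for (i = 0; i < 26; i++)
        count[i] = 0;

    for (; *pstr != '\0'; pstr++)
        if (isalpha((unsigned char)*pstr))
            count[toupper((unsigned char)*pstr) - 'A']++;
}
```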










This fragment is intended to create a count of how often each letter appears 
in the string pointed to by pstr. This will work fine as long as the codes 
for upper-case letters are contiguous, like they are in the case of ASCII. If 
they aren't, then the index expression can produce values outside the 
bounds of the array. An easy way to achieve the required result without 
this dependency, is to use a switch statement: 
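
A minimal sketch of the switch approach. The helper letter_index() and the function name count_letters_portable() are hypothetical:

```c
#include <ctype.h>

/* Map a letter to an index from 0 to 25 without assuming contiguous
   codes. Returns -1 for anything that is not a letter. */
static int letter_index(int ch)
{
    switch (toupper(ch))
    {
        case 'A': return 0;   case 'B': return 1;   case 'C': return 2;
        case 'D': return 3;   case 'E': return 4;   case 'F': return 5;
        case 'G': return 6;   case 'H': return 7;   case 'I': return 8;
        case 'J': return 9;   case 'K': return 10;  case 'L': return 11;
        case 'M': return 12;  case 'N': return 13;  case 'O': return 14;
        case 'P': return 15;  case 'Q': return 16;  case 'R': return 17;
        case 'S': return 18;  case 'T': return 19;  case 'U': return 20;
        case 'V': return 21;  case 'W': return 22;  case 'X': return 23;
        case 'Y': return 24;  case 'Z': return 25;
        default:  return -1;
    }
}

/* Count how often each letter appears in pstr, ignoring case */
void count_letters_portable(const char *pstr, int count[26])
{
    int i;

    for (i = 0; i < 26; i++)
        count[i] = 0;

    for (; *pstr != '\0'; pstr++)
    {
        int index = letter_index((unsigned char)*pstr);
        if (index >= 0)
            count[index]++;
    }
}
```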





This version will work with any code representation for letters, regardless of
whether they're a contiguous set of codes or not. If you use the standard
library functions as far as possible for converting and testing characters
rather than writing your own versions, you will minimize the risk of
writing code that is dependent on a particular character set for correct
operation.





Of course in this day and age, nearly all computers will use 
the same representation, so whether you act on this guideline 
is up to you. For your program to be truly portable, you will 
have to think long and hard about using it. 






You need to be a little careful about converting characters stored as type
char to other integer types. For characters with codes greater than the
decimal value 127 the sign bit will be set. What happens when the character
is converted to int or long is implementation-dependent. In some
environments the conversion will treat the char value as signed, and
extend the sign bit so that the high-order bits in the word will be set to 1.
To avoid this you need to cast the character value to unsigned char before
the conversion occurs.
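
A minimal sketch of the cast, assuming 8-bit characters. The function name char_code() is hypothetical:

```c
/* Return the code of ch as a non-negative int, whether or not the
   compiler treats plain char as a signed type */
int char_code(char ch)
{
    return (int)(unsigned char)ch;
}
```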
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Avoiding Dependencies on Number Representation 


You need to take care to avoid making assumptions about how numbers are 
represented, especially negative integers. There are two forms for 
representing negative integers that you may come across. 





The first form simply reserves the leading bit in an integer value as a sign 
bit. For positive values the sign bit is 0 and for negative values the sign bit 
is 1. The data bits for a number of a given magnitude are the same, the 
sign bit determining whether a number is positive or negative. 
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The second form for representing negative integers we have already looked 

at - the two's complement form - and this is rather more common because it
simplifies the hardware for integer arithmetic. The two's complement of an
integer value is produced by flipping all the bits - a 1 bit is replaced by 0,
and a 0 bit is replaced by 1 - and adding 1 to the result. For example, the
value +8 as a 16 bit binary number is:


0000 0000 0000 1000 


To obtain the two's complement representation for -8, we just flip the bits
producing:


1111 1111 1111 0111


and then we add 1, giving the final result:


1111 1111 1111 1000


You can verify that this is indeed -8, by adding the binary representation of 
+12, which is: 


0000 0000 0000 1100 
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If you try it you will get the binary equivalent of 4. 


When a negative integer value in two's complement form is shifted right, 
the sign bit is propagated, so shifting -8 two positions to the right produces: 


1111 1111 1111 1110 
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This is -2 in decimal form, so propagating the sign in a right shift produces 
a result equivalent to dividing by 2 for each position shifted. In our 
example, we shifted right by two bits so we effectively divided -8 by 4, 
giving the result -2. 
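
The arithmetic above can be checked with a minimal sketch that works on 16-bit patterns held in an unsigned long, so that it makes no assumption about how the machine itself stores negative numbers. The function names are hypothetical:

```c
/* Two's complement of a 16-bit pattern: flip the bits, then add 1 */
unsigned long negate16(unsigned long bits)
{
    return (~bits + 1UL) & 0xFFFFUL;
}

/* Shift a 16-bit pattern right by n places (n from 0 to 15), copying
   the sign bit into the vacated positions */
unsigned long shift_right_signed16(unsigned long bits, unsigned int n)
{
    unsigned long result = (bits & 0xFFFFUL) >> n;

    if (bits & 0x8000UL)                               /* sign bit set? */
        result |= (0xFFFFUL << (16U - n)) & 0xFFFFUL;  /* fill with 1s  */
    return result;
}
```

With these definitions, negate16(0x0008UL) gives the pattern 0xFFF8, which is 1111 1111 1111 1000, and shifting that pattern two places with shift_right_signed16() gives 0xFFFE, which is 1111 1111 1111 1110.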


While many machines use the two’s complement representation for negative 
integers, this isn’t universal, so you need to avoid relying on the behavior 
of signed integers when shifted right. You can’t assume a right shift 
produces the equivalent of a divide operation, and you should only use the 
right shift operator when it’s protected from the effects inherent in two’s 
complement representation. 


You can avoid two’s complement dependency by ensuring that shift
operations are only applied to unsigned integer values. One way to do this
is to always cast the value to be shifted to an unsigned type. For example:





The cast of the variable Mask to unsigned long ensures that the sign isn't 
propagated in the shift operation. 


If your code is going to run in a mainframe environment, you must be
aware that on some machines floating point numbers are stored with a
hexadecimal base. As a consequence, a single precision floating point value
with a normalized 24-bit mantissa can have up to three leading zero bits,
since the normalization only requires that the leading hexadecimal digit is
non-zero. This can be a problem with some numerical calculations where
you're dependent on full 24-bit accuracy. The solution in such an instance
is to change the type from float to double.


Avoiding Range Dependencies 


There can be significant variations in the range of values supported by the 
different types of variables in C. The ANSI C standard does provide the 
following specification for the ranges guaranteed to be supported in a 
conforming compiler: 


char             0 to 127
unsigned char    0 to 255
int              -32,767 to +32,767
unsigned int     0 to 65,535
long             -2,147,483,647 to +2,147,483,647
unsigned long    0 to 4,294,967,295
float exponent   -37 to +37
float mantissa   6 decimal digits


Note the lower limits for variables of type long and int. Most machines
use two’s complement arithmetic which will allow negative numbers that
are one less than the lower limits shown here. However, not all machines
use the two’s complement representation for negative numbers - hence the
higher values for the lower limits of these ranges.


The range guaranteed for values of type float will be from 1E-37 to
1E+37, either positive or negative.


If you can write your program such that values for various types don’t fall 
outside these ranges, then values in your program won't cause you a 
problem. Of course, the range of numbers you need is usually application- 
dependent, particularly when it comes to floating point values, and you may 
not be able to limit yourself to the above ranges. However, most systems 
will provide ranges of floating point values that comfortably exceed those 
shown. 


Actual Limits for Numeric Values 
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It’s possible that you may want to set a variable to the maximum or 
minimum possible for whatever type it happens to be. The standard header
file LIMITS.H defines identifiers which represent the upper and lower limits
for each integer type on your particular machine and compiler. The
definitions in LIMITS.H include:


SCHAR_MIN    Minimum value for type signed char
SCHAR_MAX    Maximum value for type signed char
CHAR_MIN     Minimum value for type char
CHAR_MAX     Maximum value for type char
UCHAR_MAX    Maximum value for type unsigned char
INT_MIN      Minimum value for type int
INT_MAX      Maximum value for type int
UINT_MAX     Maximum value for type unsigned int
LONG_MIN     Minimum value for type long
LONG_MAX     Maximum value for type long
ULONG_MAX    Maximum value for type unsigned long


The header file FLOAT.H defines values relating to floating point operations. 
Many of these are rather specialized, but you may have need of: 


FLT_MIN    Minimum positive value of type float
FLT_MAX    Maximum value of type float
DBL_MIN    Minimum positive value of type double
DBL_MAX    Maximum value of type double


By using these definitions for limits rather than explicit numeric values,
you can check values against the limits that actually apply to the machine
and compiler in use, whatever they happen to be.
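
As a minimal sketch of how these identifiers can be used, the hypothetical function add_checked() below tests against INT_MAX and INT_MIN before adding, rather than assuming any particular size for int:

```c
#include <limits.h>

/* Returns 1 and stores a + b in *sum if the result fits in an int.
   Returns 0, leaving *sum untouched, if the addition would overflow. */
int add_checked(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *sum = a + b;
    return 1;
}
```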


Variations in Word Sizes 


Variations in word sizes are responsible for the variations in the ranges of 
values that the various types in C can deal with. They also affect memory 
addressing. A variable of type int must be at least two bytes, but it can be 
more, so if you want to maximize portability, you mustn't write programs 
that make any assumptions about how much memory your variables occupy. 
This means always using the operator sizeof when you need to know how 
much memory a particular object in your program requires. 
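
A minimal sketch of the idea, assuming memory is obtained with malloc(). The function name make_int_array() is hypothetical:

```c
#include <stdlib.h>

/* Allocate room for count values of type int. The size of an int is
   never written into the code as a number. Returns NULL on failure. */
int *make_int_array(size_t count)
{
    return (int *)malloc(count * sizeof(int));
}
```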


You can use typedef to give yourself some flexibility when you expect to 
be moving your program between different computers. You can redefine the 
basic types in C with statements at global scope such as: 


typedef int INT; 

typedef unsigned int UINT; 
typedef long LONG; 

typedef unsigned long ULONG; 


... 


and then write your program in terms of your own type definitions. If you
are then constrained by a specific machine environment, you can change the
definition of a type on a global basis. This technique is usually used to
choose between the integer types available in a particular environment, but
it can also be used for floating point types.




The size of a pointer shouldn't normally create problems, but there is a
potential problem associated with taking the difference between two
pointers. The difference is usually of type int, but on some machines which
allow you to create very large arrays or structures it can be long. You can
avoid such complications by using a variable of type ptrdiff_t to store the
difference between two pointers. This type is defined in the standard header
file STDDEF.H, and will automatically accommodate whatever result you get
in any machine environment.
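
A minimal sketch, using a hypothetical function elements_between() that takes two pointers into the same array:

```c
#include <stddef.h>

/* Number of elements from first up to last. Both pointers must point
   into the same array. */
ptrdiff_t elements_between(const double *first, const double *last)
{
    return last - first;
}
```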





Storage Alignment 


We saw in our discussion on structures that some processors require 
variables to be stored at an address that is a multiple of their size, and that 
this can cause a structure to occupy more memory than the minimum 
necessary to hold the members of the structure. This alignment is primarily 
needed so that data can be moved efficiently between the main memory of 
a machine and the processor registers, and is usually determined by the 
width of the data bus. The following diagram shows how the same 
structure can occupy a different amount of memory as a result of a different 
boundary alignment for variables: 


struct MyStruct
{
   int a;     /* 2 bytes */
   long b;    /* 4 bytes */
   char c;    /* 1 byte */
   long d;    /* 4 bytes */
} aVariable;


[Diagram: layout of aVariable in memory with 2 byte alignment]


This demonstrates quite clearly why you must always use the sizeof
operator to determine how much memory is required for a particular object:
not only can the memory for a structure be greater than the sum of the
memory required for its individual members, but it can also vary from one
machine to another.
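You can check the effect of alignment on your own machine with a sketch like the following. The function names SumOfMembers() and PaddingBytes() are hypothetical, and the sizes you get will probably differ from the 2, 4, 1 and 4 bytes assumed in the comments above, since they depend on the compiler and the machine.

```c
#include <stddef.h>   /* size_t */

struct MyStruct
{
    int  a;
    long b;
    char c;
    long d;
};

/* Memory needed by the members alone, ignoring any padding */
size_t SumOfMembers(void)
{
    return sizeof(int) + sizeof(long) + sizeof(char) + sizeof(long);
}

/* Bytes inserted by the compiler for alignment; this may be zero */
size_t PaddingBytes(void)
{
    return sizeof(struct MyStruct) - SumOfMembers();
}
```

Whatever PaddingBytes() returns, sizeof(struct MyStruct) is the only figure you should use when allocating memory or writing the structure to a file.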


Avoiding System Environment Dependencies


The operating system environment can create barriers to portability, and 
although these can be serious in some environments, if you must write code 
for these you will just have to live with it. Microsoft Windows, for example, 
assumes control of all communications with the user, and this has a 
profound effect on program structure as well as code for handling the user 
interface. In such cases, it’s virtually impossible to avoid built-in barriers to 
portability, and the best you can do is to try to isolate some parts of your 
code in order to allow their reuse in a different context. However, in other 
contexts there are a couple of things which can cause problems but which 
are avoidable. 


UNIX low-level file input and output functions are often used with C,
largely on the grounds of efficiency, but if you want your program to be
portable you'll need to avoid these, because they're not available on
non-UNIX systems, and often vary between versions of UNIX from different
suppliers. You should be able to do what you want with files using the
standard library functions for file I/O.
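The following sketch uses only standard library calls (tmpfile(), fwrite(), rewind(), fread() and fclose()) to write some values to a file and read them back. The function name RoundTrip() and its parameters are assumptions made for the purpose of the example.

```c
#include <stdio.h>

/* Writes count values to a temporary file and reads them back into
   pResult. Returns the number of values read back, or -1 if the
   temporary file could not be created. */
int RoundTrip(const long *pValues, int count, long *pResult)
{
    FILE *pFile = tmpfile();      /* Binary update mode, removed on close */
    int nRead = 0;

    if(pFile == NULL)
        return -1;

    if(fwrite(pValues, sizeof(long), (size_t)count, pFile) == (size_t)count)
    {
        rewind(pFile);            /* Back to the start before reading */
        nRead = (int)fread(pResult, sizeof(long), (size_t)count, pFile);
    }

    fclose(pFile);
    return nRead;
}
```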


Another avoidable dependency is the third parameter to main() for passing 
details of environment variables. This isn’t part of the C standard and isn’t 
supported on some systems, so if you must have portability, avoid this. It’s 
usually supported under UNIX, and some compilers also support it for IBM 
compatible PCs.
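If all you need is the value of a particular environment variable, one portable alternative is the standard library function getenv(), declared in STDLIB.H. The sketch below wraps it in a hypothetical helper, EnvOrDefault(), which supplies a fallback when the variable isn't set.

```c
#include <stdlib.h>   /* getenv() */

/* Returns the value of the environment variable pName, or pDefault
   if no such variable is defined. */
const char *EnvOrDefault(const char *pName, const char *pDefault)
{
    const char *pValue = getenv(pName);

    return (pValue != NULL) ? pValue : pDefault;
}
```

The names of environment variables, and what they mean, are still system dependent, so this removes only the dependency on the form of main().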


Linker Problems 


Although it may not be immediately obvious, the linker for object modules
generated by a C compiler can seriously restrict some aspects of your
source programs. We've already seen that the linker may restrict external
names to six characters, and in addition, some other linkers are unable to
differentiate between upper- and lower-case text. This means that if your
program must be portable to such environments, then you can't rely solely
on differences in case to differentiate your external identifiers. You must
differentiate them by ensuring that each is composed of a unique sequence
of characters.


This effectively removes case sensitivity so far as external identifiers are 
concerned. While this is often stated as a requirement for portable C, it 
tends to undermine the fact that C provides case sensitivity. For this reason, 
it's best to ignore it unless there is a clear need to port to a machine with 
such an unfortunate disability. 


Easing Program Maintenance 


The need for program maintenance arises either because a bug in the 
program needs to be fixed, or because an extension to the program is 
required. In either case, being able to readily understand the logic and 
structure of the program is essential for effective and efficient program 
maintenance. This will be determined by how well the program is 
documented, and by how clearly written and well structured the program 
is. It also depends on whether the program has been modified previously, 
and how well that was done. 


Good documentation and a well-structured program become particularly
important when the person undertaking the program maintenance isn't the
original author of the program. Items that contribute to making program
maintenance a predictable activity come under the heading of programming
standards in professional programming circles. This is a large topic for
which you will find a number of excellent books available, so here we'll be
looking at just a few of the elementary considerations.


Programming Style 
A good clear programming style is fundamental to making program
maintenance easy, and it also helps you to write good code by enforcing a
discipline. There's a lot of debate about which style of code presentation is
the best, but if you adopt a clear style, and apply it consistently, you can't
go far wrong. Perhaps the aspect of C-programming style which has the
most immediate impact is how the code is indented, providing visual cues
to the logic of a program. There are three primary approaches to this that
we can illustrate with a fragment of code from an example in Chapter 8:


[Code illustration: aligned braces and indented text]

[Code illustration: indented and aligned braces and text]
All of these are in common use and which you prefer is purely a matter of
taste. The third version is the most compact, but the first version is perhaps
the best because the block structure seems easier to see, and there appears
to be less chance of accidentally omitting a brace somewhere. The most
important consideration is that you choose one style and stick to it
throughout your programs.
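To give a feel for the three layouts, here is one small function written three times. It isn't the Chapter 8 fragment; SumPositive1(), SumPositive2() and SumPositive3() are hypothetical names. The first two layouts follow the descriptions "aligned braces and indented text" and "indented and aligned braces and text", while the third, with the opening brace at the end of the controlling line, is our assumption about the compact style.

```c
/* Style 1: braces aligned with the controlling statement, text indented */
int SumPositive1(const int *pValues, int count)
{
    int i = 0;
    int sum = 0;

    for(i = 0; i < count; i++)
    {
        if(pValues[i] > 0)
        {
            sum += pValues[i];
        }
    }
    return sum;
}

/* Style 2: braces indented and aligned with the text they enclose */
int SumPositive2(const int *pValues, int count)
    {
    int i = 0;
    int sum = 0;

    for(i = 0; i < count; i++)
        {
        if(pValues[i] > 0)
            {
            sum += pValues[i];
            }
        }
    return sum;
    }

/* Style 3 (assumed): opening brace at the end of the controlling line */
int SumPositive3(const int *pValues, int count) {
    int i = 0;
    int sum = 0;

    for(i = 0; i < count; i++) {
        if(pValues[i] > 0) {
            sum += pValues[i];
        }
    }
    return sum;
}
```

All three compile to the same thing; only the visual cues differ.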


Program Documentation 


You should create program documentation as part of the process of 
developing a program. There are two distinct and complementary styles of 
documenting your program, and you must get into the habit of using both. 


Commenting 


This can be a difficult discipline to get into, but the very least you should 
do is ensure that your program is comprehensively explained with 
comments once it seems to be working. This needs to be done at two 
levels. You need to document statements within each function as to their 
meaning, intended use, and operation, and you also need to provide 
comments at the beginning of every function that explain its use, the 
significance of the parameters, and any other important features of its 
operation. You should also standardize the appearance of the comments at a 
function level as far as you can. 


Let's look at a typical approach to documenting a function: 
As with the indenting of statements, the precise form of comments in your
program is relatively unimportant, as long as they're clear and consistent. A
uniform and easily understood presentation is the objective. Within the
comments documenting each function, you should at least explain the
purpose, the return value, and each of the specified arguments.
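One possible layout for a function header comment, covering the purpose, the arguments and the return value, is sketched here. The function Average() and the layout of the comment box are hypothetical; any format will do provided you apply it uniformly.

```c
#include <stddef.h>   /* NULL */

/*---------------------------------------------------------------
 * Function:   Average
 * Purpose:    Calculates the arithmetic mean of an array of values.
 * Arguments:  pValues - address of the first value (not modified)
 *             count   - number of values in the array
 * Returns:    The mean of the values, or 0.0 if pValues is NULL
 *             or count is less than 1.
 *---------------------------------------------------------------*/
double Average(const double *pValues, int count)
{
    double sum = 0.0;                   /* Running total of the values */
    int i = 0;                          /* Loop counter                */

    if(pValues == NULL || count < 1)    /* Nothing to average          */
        return 0.0;

    for(i = 0; i < count; i++)          /* Accumulate the total        */
        sum += pValues[i];

    return sum / count;
}
```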


External Documents 


For a program of any size beyond the trivial, you can't assume that the
program is self-documenting, no matter how good the comments are. You
need to have external documentation providing an overview of the program
structure, the functions that go to make up the program, special techniques
used in the program, working descriptions of the functions in the program,
external libraries used, and so on. Generally, the amount of additional
program documentation that is necessary increases at least in proportion to
the size of the program.


Program Structure 


A program with a well thought out structure, organized into sensibly-sized 
file units, is very much easier to maintain than one that isn't. There are two 
sides to structuring a program: the segmentation of the code into functions, 
and the packaging of the program into files. There are some rough guides 
to sizing that you can apply here. 
Splitting Functions


Functions in C should be small units of code with a well-defined purpose.
If you find that you're writing functions with more than 40 to 50 lines of
code, then in most instances you should be able to subdivide them into
smaller and simpler units. Of course, there are no absolutes here, and there
is always the exception to the rule. You can typically identify three kinds of
functions used in your programs:


- Standard library functions

- General purpose functions that you use in multiple applications

- Application-specific functions


Creating your Own Libraries 


General purpose functions can conveniently be packaged in separate files 
incorporating all the definitions required by those functions. They become 
essentially your own extensions to the standard library. 


Protecting Program and Header Files 


We saw how to use pre-processor directives to protect against duplication of 
functions or definitions in a program, and it's generally good practice 
always to protect program and header files in this way, even when it may 
not be necessary in a particular application context. You need to keep your 
program and header files within size limits that you find comfortable to 
work with, and usually the functions that make up your program will fall 
into suitable groups. Most people are happy working with files containing 
up to four or five hundred lines of code, but naturally they can be much 
smaller, or even larger, if the structure of the application dictates. 
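As a reminder of how the protection works, the sketch below imitates a header being included twice in the same source file. The header name MYUTILS.H, the identifier MYUTILS_H, and the definitions inside it are all hypothetical. Without the #ifndef test, the second copy of the structure definition would be rejected by the compiler.

```c
/* --- First inclusion of the hypothetical header MYUTILS.H --- */
#ifndef MYUTILS_H
#define MYUTILS_H

struct Item { int id; };
#define MAX_ITEMS 100

#endif

/* --- Second inclusion: skipped, because MYUTILS_H is now defined --- */
#ifndef MYUTILS_H
#define MYUTILS_H

struct Item { int id; };
#define MAX_ITEMS 100

#endif

/* Uses the definitions from the header exactly once */
int MakeId(void)
{
    struct Item item;

    item.id = MAX_ITEMS;
    return item.id;
}
```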


Segmenting Files 
A good segmentation of a program into files will give you several
advantages. In most environments, you only need to recompile those files
that you've altered at any point, which can save a lot of compilation time
when you're developing a large program. Editing a file of a modest size is
relatively easy, since you can keep in mind what it contains, and maintain a
feel for where various functions appear in the file. If the files represent
fairly self-contained groups of functions, you will minimize the need for
fiddling with two or more files when making changes to the program.


Defensive Programming 


As we have gone through the various aspects of the language, we’ve seen a 
number of things that provide some protection against errors, or enable you 
to detect errors more easily. None of these things are essential to a program, 
but by adopting such approaches you make your programs more secure, 
easier to follow, and less prone to errors. This has to be good medicine, 
since programming is still an inexact science, and we need all the help we 
can get. All of the things we are talking about here could be grouped under 
defensive programming. Let’s review some of the most significant techniques. 


Initializing Variables 


You should initialize variables in their declarations wherever practicable. If 
the majority of your variables have known values from the beginning, it 
becomes less likely that you'll inadvertently use a variable that hasn't been 
properly set. You should choose the values judiciously - scattering zeros 
around is fine in general, but you can often choose an initial value that will 
throw up an error if something doesn't get set properly. You should do this 
whenever you can. 
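Here is a small sketch of the idea of choosing an initial value that can't be mistaken for a valid result. The names FindValue() and NOT_FOUND are hypothetical. Because position starts out as an impossible array index, a search that finds nothing is easy to detect.

```c
#define NOT_FOUND (-1)      /* Can never be a valid array index */

/* Returns the index of the first element equal to target,
   or NOT_FOUND if there is no such element. */
int FindValue(const int *pValues, int count, int target)
{
    int i = 0;
    int position = NOT_FOUND;    /* Known, deliberately invalid value */

    for(i = 0; i < count; i++)
    {
        if(pValues[i] == target)
        {
            position = i;
            break;
        }
    }
    return position;
}
```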


Naming Conventions 


You should choose meaningful names for variables rather than using single 
characters. Single character names can be tempting because they minimize 
the amount of keying needed to enter your program. However, it's likely 
that in most cases, two days hence you won't remember what they are. 
There are some exceptions in specific contexts. 


One exception perhaps is for loop counters, where variables such as i, j,
or k are frequently used. Another is for some parameters to mathematical
or geometric functions, where x is commonly used to represent an
independent variable, and x and y are normally understood to be point
co-ordinates. Where there is any possibility that a variable name might not
be clear, you should choose a suitable identifier to make it obvious what is
meant.
Adopting a convention for naming some types of variables can make your
programs much more readily understood. You don't need to go as far as
the full Hungarian notation, but adopting a few conventions, such as
beginning pointers with p and capitalizing identifiers specified in a #define
directive, can be a tremendous help. As we saw when we first looked at
variables, you shouldn't use names that begin with an underscore, and you
should avoid names beginning with two underscores, just to be on the safe
side. This will avoid confusion with internal names defined in the standard
header files.


Applying Constants 


Constants should be constants. Where you define a constant such as: 





there is nothing to prevent you changing the initializing string. However, the 
string is a constant and shouldn't be modified. By using the const modifier, 
you can make the string more secure: 





Your compiler now knows that the string pointed to by pstr is a constant. 
Your compiler will therefore give you an error message if you make any 
direct attempt to alter the string. The pointer has been defined as a pointer 
to a constant string, so although you're free to change the pointer to contain 
another address, it can't be used to alter the object it points to. Of course, 
this doesn't prevent you circumventing this in a variety of sneaky ways, but 
at least one wall is in place to protect the constant string. Don't forget that 
you can declare any kind of variable as const, so the opportunities to 
reduce errors in your program are legion. 
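The following sketch shows the effect of the const modifier on a pointer to a string. The text of the strings and the function name PointElsewhere() are hypothetical. The pointer can be given a new address, but the commented-out assignment through the pointer would be rejected by the compiler.

```c
const char *pstr = "A constant message";   /* Pointer to a constant string */

/* The pointer itself isn't const, so it can hold another address */
void PointElsewhere(void)
{
    pstr = "Another constant message";

    /* pstr[0] = 'a';      Error: the object pointed to is const */
}
```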


Magic Numbers 
You should avoid using ‘magic numbers’ in your program. A magic number
is a number that appears out of thin air without any obvious indication of
what it is, where it came from or what it represents. For example, in the
code fragment:



















the value 50 controlling the loop is a magic number. Where did it come 
from? Presumably it represents the size of the array, in which case it should 
have been declared as: 





Now we know what is controlling the loop. As well as making the program
easier to understand, this technique also gives you the flexibility to make
global changes to array sizes, and have all the program code adjust
automatically.
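A sketch of the technique, using the value 50 from the discussion above, might look like this. The names ARRAY_SIZE, data and FillData() are hypothetical.

```c
#define ARRAY_SIZE 50          /* The one place where the size is stated */

double data[ARRAY_SIZE];

/* Sets every element of data[] to the given value */
void FillData(double value)
{
    int i = 0;

    for(i = 0; i < ARRAY_SIZE; i++)    /* No magic number in sight */
        data[i] = value;
}
```

Changing the definition of ARRAY_SIZE now alters the array and every loop that processes it in one step.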


Parentheses in Expressions 


If you know the precedence of all the operators in C then you'll know 
precisely where parentheses are necessary, and where they aren't. If you are 
at all in doubt, put the parentheses in, and you will always be correct. 


One area where it's as well to use parentheses habitually is with expressions 
involving bitwise operators. This is for two reasons: firstly, because they're 
used relatively infrequently, and secondly, because they're a little weird and 
sometimes counter-intuitive, and therefore hard to remember. We can show 
this with a couple of examples. If you have an if statement such as: 


if(a+b != 0)
   /* Do something */





it means what you expect it to mean. First, a+b is calculated, and if the 
result isn't equal to zero then the if condition is True. Contrast this with 
the statement: 


if(a&b != 0)
   /* Do something */


Because the precedence of & (and the other bitwise operators) is lower than
the operator !=, the expression b!=0 is evaluated first, and the result is
combined with a using the bitwise AND operator. Thus, to get them to
work as you would typically want them to, bitwise expressions need to be
parenthesized when appearing in a test expression. We can write:





Now we'll AND a and b together, and then test whether the result is zero. 
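The difference between the two forms is easy to demonstrate. In this sketch the function names are hypothetical, and the first function spells out with explicit parentheses what the compiler makes of the unparenthesized test a&b != 0.

```c
/* What a&b != 0 really means: != is applied before & */
int TestAsCompiled(int a, int b)
{
    return a & (b != 0);
}

/* What was intended: AND the values, then compare with zero */
int TestAsIntended(int a, int b)
{
    return (a & b) != 0;
}
```

With a and b both equal to 2, the first function produces 0, because 2 & 1 is zero, while the second produces 1.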


Pointers 


Using pointers containing invalid addresses is likely to be a major 
contributor to bugs in your programs. Of course, initializing pointers to 
NULL when you declare them is an important defense mechanism, but you 
should also try to ensure that whenever the address contained in a pointer 
becomes invalid, then it's reset to NULL. This occurs most frequently when 
freeing memory you have allocated on the heap, and you can relieve 
yourself of the burden of having to remember to reset a pointer to NULL 
whenever you free some memory on the heap, by defining a macro as 
follows: 





If you now use the macro FREE() instead of the function free(), you will 
automatically reset the pointer to NULL on each call. Of course, you still 
need to deal with other pointers that you may have set to the same 
address. 
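One way such a macro could be written is sketched below; this is an assumption about its form rather than a definitive version. The function DemoFree() is hypothetical and simply exercises the macro.

```c
#include <stdlib.h>   /* malloc(), free(), NULL */

/* Release the memory and reset the pointer in one step.
   The argument is evaluated twice, so it must have no side effects. */
#define FREE(p) (free(p), (p) = NULL)

/* Returns 1 if the pointer is NULL after the memory is released,
   or 0 if the memory could not be allocated in the first place. */
int DemoFree(void)
{
    char *pBuffer = (char *)malloc(100);

    if(pBuffer == NULL)
        return 0;

    FREE(pBuffer);              /* pBuffer no longer holds a stale address */
    return pBuffer == NULL;
}
```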


Pointers passed as arguments to a function are always a risk, because they 
provide a license to alter variables in the calling program. Where the 
intention is to provide access on a read only basis, you should use the 
const modifier in the function parameter definition. For example, a function 
to count the characters in a string could have the prototype: 





This will ensure that there's no inadvertent alteration of the location pointed 
to by the passed pointer argument. 
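A prototype of this kind, together with a possible implementation, is sketched here. The function name CountChars() is hypothetical. Note that the pointer itself can still be moved along the string; it is only the characters pointed to that are protected.

```c
#include <stddef.h>   /* size_t */

size_t CountChars(const char *pstr);     /* Read-only access to the string */

size_t CountChars(const char *pstr)
{
    size_t count = 0;

    while(*pstr++ != '\0')     /* The pointer moves; the string is untouched */
        count++;

    /* *pstr = 'x';               Error: pstr points to const char */
    return count;
}
```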
Using Functions


When implementing a function, it's a good idea to check that any argument 
passed to it falls within the range you're expecting. If a pointer is passed, 
check that it isn't NULL, and if a parameter is a person's height in inches, 
values less than 10 or greater than 100 are likely to be erroneous. 


You should also take care to check the status return after calling a function, 
since error conditions are indicated this way. It's all too easy to assume that 
everything is correct, when in many circumstances, it won't be. This is 
particularly true of dynamic memory allocation, and operations on files. 
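Both kinds of check are illustrated in the sketch below. The height limits are those quoted above; the function names ValidPerson() and CopyValues(), and the wording of the error message, are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

#define MIN_HEIGHT 10       /* Inches */
#define MAX_HEIGHT 100

/* Returns 1 if the arguments look plausible, 0 otherwise */
int ValidPerson(const char *pName, int height)
{
    if(pName == NULL)
        return 0;
    if(height < MIN_HEIGHT || height > MAX_HEIGHT)
        return 0;
    return 1;
}

/* Returns a newly allocated copy of count integers, or NULL if the
   arguments are unusable or the memory can't be obtained.
   The caller must free() the copy. */
int *CopyValues(const int *pValues, size_t count)
{
    int *pCopy = NULL;
    size_t i = 0;

    if(pValues == NULL || count == 0)        /* Check the arguments */
        return NULL;

    pCopy = (int *)malloc(count * sizeof(int));
    if(pCopy == NULL)                        /* Check the status return */
    {
        fprintf(stderr, "Out of memory copying %lu values\n",
                (unsigned long)count);
        return NULL;
    }

    for(i = 0; i < count; i++)
        pCopy[i] = pValues[i];
    return pCopy;
}
```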


Writing Macros 


The most important thing to remember about macros is that they aren't 
functions. They're just a recipe for a blind substitution. There's no checking 
whatsoever by the pre-processor on the effect or the appropriateness of the 
substitutions made when a macro is used. You should always put 
parentheses around the parameters in the macro definition as well as the 
entire substitution string, to protect against errors caused by operator 
precedence. For example: 





This ensures that statements such as: 





produce the answer you are looking for. You might think parentheses 
around the appearances of the parameter in the substitution string would be 
sufficient, but if you omit the parentheses around the whole thing, then the 
statement: 





wouldn't produce the correct answer. 
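The two precedence traps can be seen in the following sketch, in which all the macro and function names are hypothetical. SQUARE_BAD omits the parentheses around its parameter, and SUM_BAD omits them around the whole substitution string.

```c
#define SQUARE_BAD(x)   (x * x)          /* Parameter not parenthesized   */
#define SQUARE_GOOD(x)  ((x) * (x))
#define SUM_BAD(a, b)   (a) + (b)        /* Whole string not parenthesized */
#define SUM_GOOD(a, b)  ((a) + (b))

/* Expands to (m + n * m + n) */
int SquareOfSumBad(int m, int n)  { return SQUARE_BAD(m + n); }

/* Expands to ((m + n) * (m + n)) */
int SquareOfSumGood(int m, int n) { return SQUARE_GOOD(m + n); }

/* Expands to 2 * (m) + (n) */
int TwiceSumBad(int m, int n)     { return 2 * SUM_BAD(m, n); }

/* Expands to 2 * ((m) + (n)) */
int TwiceSumGood(int m, int n)    { return 2 * SUM_GOOD(m, n); }
```

With m equal to 2 and n equal to 3, SquareOfSumBad() produces 11 rather than 25; with m equal to 3 and n equal to 4, TwiceSumBad() produces 10 rather than 14.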
Diagnostics


Use the assert() macro liberally. It doesn't cost anything other than the 
time it takes to key in the statements - plus the time to get the typos out. 
You should also develop your own favorite message and fix up mechanism 
that you can switch on and off, which we discussed in the last chapter. In 
the early days of developing a new program, just having checks for such 
things as the array indexes being within bounds can save immense amounts 
of time. They will slow up your program, but as soon as you're reasonably 
sure that you're free of that particular kind of problem, you can switch 
them off. 





You can leave statements using assert(), or other diagnostics controlled by 
identifiers appearing in #define directives, in your program permanently. 
They can be switched off for production versions of the code, and switched 
on when the need to make changes to the program arises. Naturally, you 
will need to document your own mechanism for diagnostics when you 
include them in a program. Otherwise, you will be working out what they 
all are once more, 6 months or a year later. 
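A minimal sketch of diagnostics that can be left in the source permanently is shown here. The assert() macro is switched off by defining NDEBUG before ASSERT.H is included; the identifier DIAGNOSTICS, the macro DIAG() and the function GetValue() are hypothetical, and your own mechanism may well be more elaborate.

```c
#include <assert.h>
#include <stdio.h>

#define DIAGNOSTICS         /* Remove this line to switch the messages off */

#ifdef DIAGNOSTICS
#define DIAG(msg) fprintf(stderr, "DIAG %s(%d): %s\n", __FILE__, __LINE__, (msg))
#else
#define DIAG(msg) ((void)0)
#endif

/* Returns the element at index, checking the arguments first */
int GetValue(const int *pValues, int size, int index)
{
    assert(pValues != NULL);
    assert(index >= 0 && index < size);     /* Array index within bounds */
    DIAG("GetValue() arguments accepted");

    return pValues[index];
}
```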


We have frequently omitted error checking in the examples in this book,
because of pressures of space, and to avoid the examples involving more
code than is reasonably necessary to demonstrate the topic under discussion.
In your own programs, you should never do so. Always check error returns
from all standard library functions that provide them. Errors crop up when
you least expect them, and without the error checking in place, you'll waste
an inordinate amount of time trying to find them.


Summary 


To avoid unnecessary constraints to moving your programs between 
different machines, you need to adopt a few simple rules for code 
development: 


- Stick to ANSI C and the standard library.

- Use symbols defined in the standard library rather than explicit
  constants.

- Use the types defined in the standard library where necessary. These
  include size_t for values returned by the operator sizeof, fpos_t for
  positions in a file returned by the standard library function fgetpos(),
  and ptrdiff_t for the difference between two pointers.

- Avoid programming techniques that depend upon specific codes for
  characters.

- Assume only the minimum ranges of values for variable types.

- Always use the operator sizeof to obtain the memory space occupied
  by a variable.


You can make your programs easier to maintain and debug by adopting a 
clear and consistent programming style, and including comprehensive 
comments. By making sure that your program and header files are well 
structured and consistent, as well as including diagnostic code in your 
source, you will make tracing problems a lot easier, and the job of 
extending a program a simpler and thankfully more predictable task. 
Developing a Program in C


In this chapter we will write a program that is considerably larger than any
of the other examples in this book, giving us an opportunity to see how
some of the language features and techniques we have discussed can be
applied in a practical context.


The example has been chosen with several considerations in mind. First, it 
is compact enough to be worked through in a single chapter. This inevitably 
means some simplifying assumptions, but this in itself will provide a base 
for you to exercise your skills further in elaborating the example. Second, it 
needs to combine a reasonable spectrum of C capability, including file 
operations, and the way it has been implemented here is aimed at that. 
Third, it should be a simple application that doesn't involve a lot of 
technicality or complicated mathematics. 


One application which fits these criteria is creating and maintaining a 
personal address file. I hope you enjoy it. 
Defining the Problem


The starting point for any programming task lies in deciding what it is you 
are trying to do. Our program will provide a file containing basic 
information on acquaintances, friends, or even enemies. We will provide for 
the name, the address, and the telephone number of each individual, but 
the initial implementation of the program shouldn't inhibit expansion to 
include other details in the future. 


The operations that the program will support are: 


- Adding a record for a person.

- Deleting a record for a person.

- Listing the complete contents of the file in ascending alphabetical
  order, using the surname as the sort field.

- Searching the file for a particular entry.


This is a very simple application that is easy to understand, but as we shall 
see, it isn’t completely trivial. 


Structuring the Program 
Even though the program is quite modest, it would be easier to manage as
several files. We will need a standard header file into which we can put all
the standard definitions. This will contain all the definitions specific to the
program as well as the include files for the standard library functions that
are used. Of course, this file would need to be included at the beginning of
each of the source files that go to make up the program.


This will still leave quite a large source file which at this point doesn’t 
seem to fall into particularly obvious functional groups, but it would be 
easier to manage if it were divided up in some way. One approach would 
be to group the lower-level service functions into one source file, and leave 
main() and the functions it calls directly, together in another. 







However the source code is to be divided up, you need to remember to 
put an include statement for the program header at the beginning of each 
of the source files. You will also need to put extern declarations for any 
global variables that are used but not defined in a source file, as you can 
have only one definition for such a variable. If global variables are defined 
in the source file containing main(), which is the typical approach, then the 
source file containing the secondary functions will need to have an extern 
statement for each global variable that is used. 


Managing the Application Data 


We should decide from the outset that we will have a fixed-size data 
record. This will simplify operations with the file, and having a fixed record 
size, we will be able to program read and write operations on the file so 
that they won't need changing if the record size is subsequently altered. 


The Name Structure 


The name of a person can be tricky in a number of ways. We will assume 
that a person's name is split into two parts, a surname and a first name, 
and both of these will be alphabetic - this means no hyphenated names for 
instance. This isn't as flexible as you might like but if we package the name 
in a structure, it should be no problem to expand it later. We can define the 
structure as a type, as follows: 





Since it's defined using typedef, we can declare objects of type Name
without having to use the struct keyword. The lengths of the members of
type Name are flexible; we assume that this definition is preceded by a
definition of the array length, such as:
The name structure can easily be expanded later if you want. In order not
to make the program overly long, we will limit ourselves to comparing
surnames for sequencing file records, but you could easily extend this by
providing a special function for comparing full names.
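A structure along these lines might be sketched as follows. The member names Surname and FirstName, the symbol NAME_LENGTH with its value of 20, and the function CompareSurnames() are all assumptions made for illustration; only the type name Name comes from the discussion above.

```c
#include <string.h>   /* strcmp() */

#define NAME_LENGTH 20              /* Hypothetical array length */

typedef struct
{
    char Surname[NAME_LENGTH];
    char FirstName[NAME_LENGTH];
} Name;

/* Compares two names by surname only. The result is negative, zero,
   or positive, in the same way as for strcmp(). */
int CompareSurnames(const Name *pName1, const Name *pName2)
{
    return strcmp(pName1->Surname, pName2->Surname);
}
```

A function for comparing full names would only need to go on to compare the FirstName members when the surnames are equal.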


The Person Structure 


We can define the structure to contain the complete data record on an 
individual, and we'll need to decide how we are going to relate these 
records. Since we want to search the 'person' file, a valid solution would be 
to maintain this file as an ordered linked list. We can also include backward 
pointers to allow for future operations, but we'll only use forward 
processing of the list from the beginning in our first attempt. We can define 
a record as: 





As with the case of the previous structure, we've used typedef, so we can
use the type Person without the keyword struct. The members Previous
and Next store the file pointers for the preceding and following records in
the file. The Deleted member is a flag that will be set to 1 when a record
is to be deleted. This will allow us to detect and overwrite deleted records
when new ones are added, enabling us to use up unoccupied holes in the
file, and avoid the need to clean it up or compress it.


The Address[] member is a two-dimensional array, where the dimensions 
will be defined by directives such as: 





These would provide for 5 address lines, with up to 40 characters each. 
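Putting these pieces together, a record could be sketched as shown here. The member names Name, Address, Previous, Next and Deleted are those used in the discussion; the symbol names, the use of type long for the file positions, the cut-down Name structure and the function InitPerson() are assumptions. Other details, such as the telephone number, are left out.

```c
#include <string.h>   /* memset() */

#define NAME_LENGTH     20      /* Hypothetical                  */
#define ADDRESS_LINES    5      /* 5 address lines ...           */
#define ADDRESS_LENGTH  40      /* ... of up to 40 characters    */

typedef struct
{
    char Surname[NAME_LENGTH];
    char FirstName[NAME_LENGTH];
} Name;

typedef struct
{
    Name Name;          /* A member may share its name with its type in C */
    char Address[ADDRESS_LINES][ADDRESS_LENGTH];
    long Previous;      /* File position of preceding record, or -1 */
    long Next;          /* File position of following record, or -1 */
    int  Deleted;       /* 1 for a deleted record, otherwise 0      */
} Person;

/* Sets a record to a known, empty state with no links */
void InitPerson(Person *pPerson)
{
    memset(pPerson, 0, sizeof(Person));
    pPerson->Previous = -1L;
    pPerson->Next = -1L;
    pPerson->Deleted = 0;
}
```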





The members the program will interact with are the Name member, the 
Previous and Next members, and the Deleted flag. You could add other 
members without affecting the program very much. The Address[] member 
is just baggage. 


The Person File 


Each read or write operation on the file will transfer one Person object. 
Since these objects will be linked in alphabetical name order using file 
position pointers, we will process the file randomly. A typical structure for 
the file is illustrated here: 


[Diagram: a file containing seven records, Record 1 to Record 7, linked by their Next and Previous members]





This shows a file containing seven records. The first record will always be
the first in the alphabetical sequence, but the following records can be
located anywhere. A record will be written in the next available space,
except when it's the first in sequence, in which case we will arrange to
move the previous first record to a vacant location somewhere else in the
file.




Chapter 11 - Developing a Program in C 


The Next member always points to the next record in sequence, or has the 
value -1 if there isn't a next record. We will define each pointer relative to 
the beginning of the file, so valid pointers are always greater than or equal 
to zero. The Previous member similarly points to the previous record, and 
the first record will have its Previous member set to -1. 


Deleted records have the Deleted flag set to 1, while for a normal valid 
record the Deleted flag will be zero. The deleted records 2, 3, and 6 are 
shown greyed out in the diagram. 
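The way the Next members tie the file together can be sketched with a cut-down record. Everything here is hypothetical: the type Record stands in for Person, and WalkFile() and NO_RECORD are names chosen for the example. The function starts at the beginning of the file and follows the links until it meets -1, collecting surnames in sequence.

```c
#include <stdio.h>
#include <string.h>

#define NO_RECORD  (-1L)        /* Marks the end of the chain */
#define SURNAME_LENGTH 20

typedef struct
{
    char Surname[SURNAME_LENGTH];
    long Previous;
    long Next;
    int  Deleted;
} Record;                       /* Cut-down stand-in for Person */

/* Follows the Next links from the record at position zero, copying the
   surname of each record that isn't flagged as deleted into names[].
   Returns the number of surnames stored, or -1 on a file error. */
int WalkFile(FILE *pFile, char names[][SURNAME_LENGTH], int maxNames)
{
    Record rec;
    long position = 0L;         /* First record in sequence is at the start */
    int count = 0;

    while(position != NO_RECORD && count < maxNames)
    {
        if(fseek(pFile, position, SEEK_SET) != 0)
            return -1;
        if(fread(&rec, sizeof(Record), 1, pFile) != 1)
            return -1;
        if(!rec.Deleted)
            strcpy(names[count++], rec.Surname);
        position = rec.Next;    /* Move on to the next record in sequence */
    }
    return count;
}
```

The physical order of the records in the file is irrelevant; only the links determine the sequence.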


General Program Logic 
We've defined a range of possible operations to be supported by the
program. An easy way of implementing this is to provide a menu, like the
one below, which prompts the user to enter a letter in order to select an
option:


Enter a character to select an option:

   Add a person to the file
   Delete a person from the file
   List the file contents
   Search for data for a person
   Quit - end the program


With the Quit option as part of the menu of choices, we can drive the 
program in an infinite loop from main(), with the character entered 
selecting a particular function to be called. A switch statement will do this 
very nicely. We could write the function main() incorporating this approach 
immediately: 
This barely needs explaining. The input here uses scanf() with the format
specifier %1s, which will read the first non-blank character into ch. Note
that ch must be an array of length 2, to allow for the terminating null
character. We could use %c, but this would rely on no blanks being entered
before the character defining the selection from the range of options. Each
option will work when either an upper or lower case letter is entered. The
default case in the switch is there to catch incorrect entries. The function
main() will cycle indefinitely until 'Q' or 'q' is entered to end the
program, so that you can iterate round trying all sorts of combinations and
sequences.


Each operation supported by the program is packaged into a separate 
function. To extend the program to support additional functionality, we just 
require additional options in the menu, correspondingly reflected by cases 
in the switch statement. To complete this version of the program all we 
need to do is add four functions, AddPerson(), DeletePerson(), Search(), 
and ListFile(). 
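The shape of the menu logic can be sketched as follows. This is an illustration under several assumptions: the selection letters A, D, L and S are our choice (only Q is given above), the four functions are represented by stubs, toupper() is used to accept either case, and the loop is packaged in a hypothetical function RunMenu() that main() would call.

```c
#include <ctype.h>
#include <stdio.h>

/* Stubs standing in for the four functions still to be written */
void AddPerson(void)    { printf("AddPerson() called\n"); }
void DeletePerson(void) { printf("DeletePerson() called\n"); }
void ListFile(void)     { printf("ListFile() called\n"); }
void Search(void)       { printf("Search() called\n"); }

/* Acts on one selection. Returns 0 if the user chose to quit,
   and 1 if the menu should be shown again. */
int DoOption(char choice)
{
    switch(toupper((unsigned char)choice))
    {
        case 'A': AddPerson();    break;
        case 'D': DeletePerson(); break;
        case 'L': ListFile();     break;
        case 'S': Search();       break;
        case 'Q': return 0;
        default:  printf("Invalid selection - try again\n"); break;
    }
    return 1;
}

/* The loop that main() would run until Quit is selected */
void RunMenu(void)
{
    char ch[2];                 /* One character plus the terminating null */

    do
    {
        printf("\nEnter a character to select an option:\n"
               "   A  Add a person to the file\n"
               "   D  Delete a person from the file\n"
               "   L  List the file contents\n"
               "   S  Search for data for a person\n"
               "   Q  Quit - end the program\n");
        if(scanf("%1s", ch) != 1)       /* End of input: stop the loop */
            break;
    } while(DoOption(ch[0]));
}
```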
Adding a Person to the File




We have no menu option to create a file, so the operation to add a person
will need to create a file if it doesn't exist. We can specify the file name to
be used in a global variable at the beginning of the source code, which will
allow the file name to be changed by altering the initial value for the global
variable. We would need a statement such as:





This defines the file name as PERSONS. If you are going to compile and run 
this example, you may want to change this to reflect the needs of your 
environment, perhaps with a full path specification. 


We can see from the code for the function main() that the function 
AddPerson() is completely self-contained. It receives no arguments and 
returns nothing so its prototype is going to be: 

We should consider what the basic logic for adding a new person to the file 
is going to be. First of all, we must read in the data for a person and 
create a Person structure, and we could assume that we'll do this in a 
function ReadPerson(). We have a choice here: we could create the Person 
object dynamically and return a pointer to it, or we could have the calling 
program pass a pointer to an object and have the function fill in the data 
members with input values. In the interests of using the library function 
malloc(), let's go for the first option. We will need to remember that the 
calling program will need to release the memory for the Person object 
when it's no longer required. The prototype for ReadPerson() will be: 
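

The declarations described so far might look like the following sketch. The layout of the Person structure, the sizes NAMELEN, ADDRLEN and PHONELEN, and the value given to ADDRLINES are assumptions made for illustration; the program's own definitions live in its header file. The helper NewPerson() is hypothetical and only demonstrates the "allocate and return a pointer" option that ReadPerson() adopts.

```c
#include <assert.h>
#include <stdlib.h>

#define NAMELEN   20    /* assumed sizes, for illustration only */
#define ADDRLEN   40
#define PHONELEN  20
#define ADDRLINES 4

typedef struct {
    char Surname[NAMELEN];
    char FirstName[NAMELEN];
} Name;

typedef struct {
    Name aName;                          /* surname and first name        */
    char Address[ADDRLINES][ADDRLEN];    /* up to ADDRLINES address lines */
    char Phone[PHONELEN];                /* phone number kept as text     */
    long Next;                           /* position of next record       */
    long Previous;                       /* position of previous record   */
    int  Deleted;                        /* 1 when the record is deleted  */
} Person;

void    AddPerson(void);     /* no arguments, no return value          */
Person *ReadPerson(void);    /* returns a heap object the caller frees */

/* Hypothetical helper: allocate a zeroed Person that has no links yet.
   Returns NULL if the memory can't be obtained. */
Person *NewPerson(void)
{
    Person *pPerson = calloc(1, sizeof(Person));

    if (pPerson != NULL) {
        pPerson->Next = -1L;
        pPerson->Previous = -1L;
    }
    return pPerson;
}
```

Whoever receives the pointer owns the memory, so every successful call needs a matching free() once the record has been written.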





Having read the data and created a Person object, we'll need to open the 
file if it exists. If it doesn't exist, then we want to create it, so we need to 
open it in the first instance with a mode that will do this for us, either 
‘append’ mode or ‘write’ mode. Once we have established whether the file 
exists, there are basically two possible courses of action, depending on the 
state of the file. We can write the record directly if the file is empty, but if 
it contains information, we will need to go into a record insert operation, 
which is likely to be a little complicated. 


The general logic for the function AddPerson() is shown here: 


[Flowchart: general logic of the function AddPerson()] 


The first action is to call the function ReadPerson(), and the file is then 
opened in append mode, so that if it doesn’t exist, then it will be created 
anyway. The choice of append mode rather than write mode is determined 
on the basis that write mode would allow the file to be overwritten, 
whereas append mode only allows the file to be added to, even if the mode 
ab+ is specified. 
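

Once the stream is open, one way to establish whether the file holds any records is to look at its length. The sketch below assumes a hypothetical helper called IsEmpty(); it uses only the standard functions fseek() and ftell() and isn't taken from the program listing.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the open stream holds no data,
   0 if it contains something, and -1 if the length can't be found. */
int IsEmpty(FILE *fp)
{
    long length;

    assert(fp != NULL);
    if (fseek(fp, 0L, SEEK_END) != 0)   /* move to the end of the file   */
        return -1;
    length = ftell(fp);                 /* offset of the end is the size */
    if (length < 0L)
        return -1;
    return length == 0L;
}
```

A stream opened with fopen(name, "ab+") can be passed straight to this function, since append mode creates a missing file with a length of zero.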


If the file is empty then we need to write the Person object to the file, and 
since all our write operations will be identical, it would be a good idea to 
introduce a function to do this, which we will call WriteFile(). Then we 
can encapsulate error checking into the function, so we won't have to repeat 
it every time we write a record. 


If the file isn't empty, we will insert the Person object at the appropriate 
point which will need another function we can call Insert(). We've now 
added two further functions to our program and can write their prototypes 
as: 


The function WriteFile() receives a pointer to the Person object to be 
written, and it will assume that the file is open and positioned at the place 
where the record is to be written. It has no return value. We will check for 
a write error in the function, but all we will do if it occurs, is to report it 
and bale out of the program. 


Similarly, unless there is a catastrophic problem, like running out of disk 
capacity, we should always be able to insert a new record into the file, so 
the Insert() function has no return value. It has one argument which is a 
pointer to the Person object to be inserted. It will need to sort out where 
to write the record in the file and to make all the necessary connections to 
other records. 


The Code for AddPerson() 


Given these functions, we can put together the code for the function 
AddPerson(): 


The function implementation assumes a global variable IsFile which is set 
to 1 once it's established that the file exists. This will avoid checking for the 
file's existence every time we add a new Person record. The logic for 
checking IsFile wasn't shown in the general logic since it's not an 
essential element in the operation, just an added convenience. 


The first time through the function the flag IsFile will be at its initial 
value 0, so we will check for the existence of the file by opening it in 
append mode: 


Since this will either open the file or create it, IsFile is set to 1 and will 
remain so for as long as the program is running. Thus all subsequent add 
operations won't bother to check for the existence of the file. 


Each attempt to open the file is verified by checking for a valid pointer 
returned from fopen(). It’s most important to check that the open 
operation works because there are so many ways programming errors can 
cause failure, and trying to use an invalid pointer will surely crash the 
program. 
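

The existence check and the verification of fopen() can be sketched together as follows. The identifier FileName for the global file name and the helpers OpenChecked() and EnsureFile() are assumptions introduced for this illustration; only the flag IsFile and the use of append mode come from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

char *FileName = "PERSONS";   /* assumed identifier for the file name     */
int   IsFile   = 0;           /* set to 1 once the file is known to exist */

/* Hypothetical helper: open the file in the given mode and end the
   program with a message if the open fails. */
FILE *OpenChecked(const char *mode)
{
    FILE *fp = fopen(FileName, mode);

    if (fp == NULL) {
        perror(FileName);         /* report why the open failed */
        exit(EXIT_FAILURE);
    }
    return fp;
}

/* Make sure the file exists, creating it on the first call only. */
void EnsureFile(void)
{
    if (!IsFile) {
        fclose(OpenChecked("ab+"));   /* append mode creates a missing file */
        IsFile = 1;
    }
}
```

Because every open goes through one function, the NULL check can't be forgotten at any of the call sites.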


The AddPerson() function is one big infinite for loop. This will allow you 
to add multiple records to the file without going back to the menu. Having 
to enter a new selection for each addition to the file can be a little bit 
tedious, especially when you're initially setting up the file. 


When the record has been written, the heap memory is released, and the 
file is closed. To free the heap memory we use a macro FREE() which is 
defined as: 


By using this macro, we ensure that the pointer to the memory area is 
always set to NULL when the associated memory is deleted. This is a good 
safety precaution and will prevent us from accidentally accessing heap 
memory that has been freed, and may be allocated to something else. 
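

One common way of writing such a macro is sketched here. The exact definition used by the program isn't reproduced in this section, so the body below is an assumption rather than the original text.

```c
#include <assert.h>
#include <stdlib.h>

/* Release heap memory and reset the pointer in one step.
   The argument must be a modifiable pointer variable. */
#define FREE(p) do { free(p); (p) = NULL; } while (0)
```

Wrapping the two statements in do { } while (0) lets the macro be used safely as the single statement of an if or else. Applying FREE() twice to the same pointer is harmless, because free(NULL) does nothing.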


After releasing the memory, there's then a local choice for the user as to 
whether another add operation is required. If not the program returns to 
the main menu in the function main(). 


Creating a Person 


The code for the function ReadPerson() is: 


After using the standard library function malloc() to allocate memory for a new 
Person object, the function GetName() is called to read the data into the 
Name member of the Person object. The GetName() function will have the 
prototype: 


The function will return 1 if a valid Name object has been created, or 0 
otherwise. Having the Name object read by a separate function allows you to 
make the name handling more sophisticated without affecting the rest of the 
program. 


Once the ReadPerson() function has obtained a valid Name object, up to 
ADDRLINES address lines are read into the member Address[]. An empty 
line being entered will terminate the process. Lastly the phone number is 
read as a character string without checking, because, since all we do is 
display it, we don't need to worry about its validity. 
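

The address-reading step can be sketched in isolation. The helper ReadAddress(), its parameters, and the values chosen for ADDRLINES and ADDRLEN are assumptions for illustration; taking the input stream as an argument is a choice made here so that the function can be tried with a file as well as with stdin.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ADDRLINES 4     /* assumed maximum number of address lines */
#define ADDRLEN   40    /* assumed length of one address line      */

/* Hypothetical helper: read up to ADDRLINES lines from 'in' into
   Address[], stopping at the first empty line or at end of input.
   Unused lines are left as empty strings. Returns the number of
   lines stored. A line longer than ADDRLEN-1 characters spills
   into the following line. */
int ReadAddress(FILE *in, char Address[ADDRLINES][ADDRLEN])
{
    int i, count = 0;

    assert(in != NULL);
    for (i = 0; i < ADDRLINES; i++)
        Address[i][0] = '\0';

    while (count < ADDRLINES) {
        char *line = Address[count];

        if (fgets(line, ADDRLEN, in) == NULL)
            break;                          /* end of input          */
        line[strcspn(line, "\n")] = '\0';   /* drop the newline      */
        if (line[0] == '\0')
            break;                          /* empty line: finished  */
        count++;
    }
    return count;
}
```

In the program itself the call would be something like ReadAddress(stdin, pPerson->Address). Leaving the unused lines empty matters later, because an empty line is what ends the display of an address.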


Reading a Name 


The code for the function to read in the data for a Name object is: 





This is quite straightforward. The only checking done is to ensure that the 
names are alphabetic, since spurious characters could upset the compare 
operations we will be carrying out later. The checking is done by the 
function CheckName() with code as follows: 





This function simply checks that each character of a string passed to it is 
alphabetic. The check is made using the standard library function isalpha() 
declared in the header file CTYPE.H. If the name is valid, 1 is returned; 
otherwise 0 is returned. This function could be expanded to accommodate 
hyphens, or allow names with embedded blanks such as Mary Lou 
Creighton-Featherstone, for example. The code to compare names we will 
get to later, but it would need to deal with whatever extended name was 
allowed if you modify the CheckName() function here. 


Writing a Record 


The code for the function WriteFile() to write a Person object to the file 
is as follows: 


This uses the standard library function fwrite() to write the object to the 
file. If an error occurs with the operation, a message is displayed on 
stderr, the file is closed, and the program is exited. 
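

A minimal sketch of a function that behaves this way is shown below. The cut-down Person structure and the size NAMELEN are stand-ins assumed for illustration, and the wording of the error message is invented; pFile is the global stream used throughout the program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NAMELEN 20                  /* assumed field size */

typedef struct {
    char Surname[NAMELEN];
    char FirstName[NAMELEN];
} Name;

typedef struct {                    /* cut-down record, for illustration */
    Name aName;
    long Next;
    long Previous;
    int  Deleted;
} Person;

FILE *pFile;                        /* global stream for the data file */

/* Write one record at the current file position. On failure, report
   the error on stderr, close the file and end the program. */
void WriteFile(Person *pPerson)
{
    assert(pFile != NULL && pPerson != NULL);
    if (fwrite(pPerson, sizeof(Person), 1, pFile) != 1) {
        fprintf(stderr, "\nError writing to file.\n");
        fclose(pFile);
        exit(EXIT_FAILURE);
    }
}
```

The caller is responsible for positioning the file with fseek() first, which keeps the function usable both for appending and for rewriting a record in place.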


Inserting a Person into the File 


The Insert() function will be the most complex part of the operation to 
add a person to the file. The Insert() function is only called if there's at 
least one record in the file. There are three possible situations that we need 
to deal with in the function: 


• Adding a Person record at the beginning of the file. 
• Inserting a Person record between two existing records in the file. 
• Adding a Person record to the end of the file. 


We also need to take account of the possibility that all the file records have 
been deleted. This will be indicated if the first record has the Deleted flag 
set to 1. We can understand the general logic of the Insert() function 
using this block diagram: 


[Block diagram: general logic of the function Insert()] 


The function reads the file starting with the first record. If the first record 
has the Deleted flag set, then the file is empty, and the new Person 
becomes the first record. If the file isn’t empty, then the Name for the new 
Person is compared to the name of the first record read. 


If the Name for the new Person is less than that of the current record, then 
we need to insert the new Person preceding that record. If the current 
record is the first, then we need to insert the new Person as the first 
record, which means that the current record must be moved somewhere 
else, and the following record needs to have its Previous pointer updated 
to reflect the new position of the current record. 


If the current record isn't the first, then we must be inserting the new 
person between two existing records. This involves finding a place for the 
new Person in the file after setting its Next and Previous pointers, and 
updating the Previous pointer for the current record, and the Next pointer 
of its predecessor. 


If the Name for the new Person isn't less than that of the current record, 
then we read the next record and repeat the process. If the current record is 
the last then we add the new Person to the end of the file. 


Note that our version of the program will only consider surnames for 
sorting purposes, and will assume that there aren't any duplicates. 


Using the logic from the previous diagram, we can write the code for the 
Insert() function: 


After setting the file position back to the start, all the work is done in the 
infinite for loop. When a record is read from the file we check whether it 
is the first record with the Deleted flag set. If so our problems are solved 
since the new Person object will be the first and only record in the file, so 
we just back up to the start and write it, and thus we're finished. Note that 
we don't read the file directly - we use the function Read() to do it for us. 
This allows us to check for and handle read errors within the function, so 
we don't need to worry about them externally. 


If the first record isn't deleted, we compare the Surname member of the 
Name member of the new Person object with that for the record just read. 
The compare uses a macro which can be defined as: 





This macro tests if the first string argument is less than the second, and 
returns True if it is. It’s a question of personal preference as to whether you 
like to use the macro, or prefer to use the library function strcmp() 
directly. 
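

A macro of this kind can be written in one line on top of strcmp(). The name STRLT is the one used later in the text; the body shown is the obvious definition rather than a copy of the original listing.

```c
#include <assert.h>
#include <string.h>

/* True (non-zero) when string a sorts before string b. */
#define STRLT(a, b) (strcmp((a), (b)) < 0)
```

Since strcmp() compares character codes, an upper case letter sorts before any lower case letter in ASCII, so "Zed" is less than "allen" as far as this macro is concerned.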


Insertion Strategy 


If the new surname is less, then we need to add the new person to the file 
with pointers set so that it precedes the record just read. If the record just 
read is the first record in the file, then we need to move it so that we can 
write the new Person as the first record, which is done by the function 
AddFirst(). If the current record isn't the first, then we must be adding 
the new Person object between the current record and its predecessor, and 
the function AddMiddle() achieves this. 


If the surname for the new Person object isn't less than that of the current 
record, we need to check whether the current record is the last in the file, 
which will be indicated by a value of -1 for the Next pointer. In this case 
we need to add the new Person in an available space in the file (which 
may turn out to be the end), and update the Next pointer for the record 
we read from the file with the position of the new Person object. The 
function Write() searches for a vacant space for a new Person in the file, 
and if there's no embedded deleted record, then it will append the new 
record on the end. 


If we're not at the last record in the file, then the file position is moved to 
where the next record in sequence is to be found. Its position is stored in 
the Next pointer of the current record. We then cycle through the loop 
again. 


The code for adding a record at the start or in the middle of the file could 
have been included in the Insert() function, but it would have made the 
function rather long and somewhat difficult to follow. 


Reading a Record 


The code for the function to read a record from the file is as follows: 





The basic service performed here is to read a record into the structure 
pointed to by the function argument, and return its position in the file. The 
rest of the code is for error checking. The first check is for end of file, and 
if this is detected then -1 is returned. The second check is for a read error, 
and if this occurs a message is displayed on stderr by calling perror(), 
and the operation will be retried up to MAXERR times. If MAXERR successive 
read errors occur, then the program is terminated. 


For most disk devices, a read error is a serious error, and you wouldn't 
typically try to read the record again, but simply end the program after an 
error message. If MAXERR is set to 1 then this is how the function will work. 
With other types of magnetic storage which are open to the air, a read error 
can be caused by dust on the medium, which is sometimes dislodged by 
backing up and trying the read operation again. 
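

The behavior described for Read() can be sketched as follows. The cut-down Person structure and the value chosen for MAXERR are assumptions for illustration, and the retry loop is one possible arrangement rather than the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAXERR 3                    /* assumed number of attempts */

typedef struct {                    /* cut-down record, for illustration */
    char Surname[20];
    long Next;
    long Previous;
    int  Deleted;
} Person;

FILE *pFile;                        /* global stream for the data file */

/* Read the record at the current position into *pPerson.
   Returns the file position of the record read, or -1 at end of file.
   A read error is reported and retried; after MAXERR successive
   failures the program is ended. */
long Read(Person *pPerson)
{
    int errors = 0;

    assert(pFile != NULL && pPerson != NULL);
    for (;;) {
        long pos = ftell(pFile);            /* where this record starts */

        if (fread(pPerson, sizeof(Person), 1, pFile) == 1)
            return pos;
        if (feof(pFile))
            return -1L;                     /* nothing left to read     */

        perror("Read error");
        if (++errors >= MAXERR) {
            fclose(pFile);
            exit(EXIT_FAILURE);
        }
        clearerr(pFile);                    /* reset the error flag     */
        fseek(pFile, pos, SEEK_SET);        /* back up and try again    */
    }
}
```

Recording the position before the read is what allows the function both to return it and to back up for a retry.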


Adding a Record to the Start of the File 


Adding a record to the start is a little messy because the first record in the 
file must always be the first in sequence. As a consequence it affects not 
only the first record, but the next in sequence, too. The effect on the 
existing file members is illustrated here: 


[Figure: effect on the existing records of adding a new record at the start of the file] 


After setting the Previous pointer for the first record to 0, it’s moved to 
the first free space in the file. The position is then recorded in the Next 
pointer of the new record. The record following the former first record is 
retrieved, and its Previous pointer altered to reflect the new position of the 
former first record. It can then be written back in the same place in the file. 
Finally the new record can then be written as the first record in the file. 
The code for this is: 
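

The steps just described can be sketched as shown below. This is not the original listing: the parameter list of AddFirst() is assumed, the Person structure is cut down, and the helpers Check(), GetRec(), PutRec() and Append() are hypothetical stand-ins for the program's own Read(), WriteFile() and Write(). In particular, Append() always uses the end of the file instead of looking for a deleted slot.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {                    /* cut-down record, for illustration */
    char Surname[20];
    long Next;
    long Previous;
    int  Deleted;
} Person;

FILE *pFile;                        /* global stream for the data file */

void Check(int ok)                  /* end the program on an I/O failure */
{
    if (!ok) { perror("File error"); exit(EXIT_FAILURE); }
}

void GetRec(long pos, Person *p)    /* read the record stored at pos */
{
    Check(fseek(pFile, pos, SEEK_SET) == 0);
    Check(fread(p, sizeof(Person), 1, pFile) == 1);
}

void PutRec(long pos, Person *p)    /* write the record at pos */
{
    Check(fseek(pFile, pos, SEEK_SET) == 0);
    Check(fwrite(p, sizeof(Person), 1, pFile) == 1);
}

long Append(Person *p)              /* write at the end, return position */
{
    long pos;

    Check(fseek(pFile, 0L, SEEK_END) == 0);
    pos = ftell(pFile);
    Check(fwrite(p, sizeof(Person), 1, pFile) == 1);
    return pos;
}

/* Make *pNew the first record; *pFirst is the current first record. */
void AddFirst(Person *pNew, Person *pFirst)
{
    Person follower;
    long   movedPos;

    pFirst->Previous = 0L;              /* new record will sit at 0     */
    movedPos = Append(pFirst);          /* move the old first record    */

    if (pFirst->Next >= 0L) {           /* fix the record that followed */
        GetRec(pFirst->Next, &follower);
        follower.Previous = movedPos;
        PutRec(pFirst->Next, &follower);
    }

    pNew->Previous = -1L;               /* nothing precedes the new one */
    pNew->Next = movedPos;              /* old first record follows it  */
    PutRec(0L, pNew);
}
```

The old first record is written to its new home before position 0 is overwritten, so at no point does the file lose a record.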







Adding to the Middle of the File 


Adding a record to the middle of the file is a lot simpler than adding to 
the beginning, insofar as we don't need to move any existing records. This 
diagram illustrates what we need to do: 


[Figure: the new record for Baggage being linked in between the existing records for Allen and Boggs] 





The old pointers that need to be changed are shown with crosses, and the 
new connections are shown with dashed lines. The links between the 
records for Allen and Boggs are broken, and reconnected so that Baggage 
sits in between them. The code to do this is as follows: 
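

A sketch of the relinking is given below. As before, this isn't the original listing: the parameter list of AddMiddle() is assumed, the Person structure is cut down, and Check(), GetRec(), PutRec() and Append() are hypothetical helpers standing in for Read(), WriteFile() and Write(). Append() simply uses the end of the file.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {                    /* cut-down record, for illustration */
    char Surname[20];
    long Next;
    long Previous;
    int  Deleted;
} Person;

FILE *pFile;                        /* global stream for the data file */

void Check(int ok)                  /* end the program on an I/O failure */
{
    if (!ok) { perror("File error"); exit(EXIT_FAILURE); }
}

void GetRec(long pos, Person *p)    /* read the record stored at pos */
{
    Check(fseek(pFile, pos, SEEK_SET) == 0);
    Check(fread(p, sizeof(Person), 1, pFile) == 1);
}

void PutRec(long pos, Person *p)    /* write the record at pos */
{
    Check(fseek(pFile, pos, SEEK_SET) == 0);
    Check(fwrite(p, sizeof(Person), 1, pFile) == 1);
}

long Append(Person *p)              /* write at the end, return position */
{
    long pos;

    Check(fseek(pFile, 0L, SEEK_END) == 0);
    pos = ftell(pFile);
    Check(fwrite(p, sizeof(Person), 1, pFile) == 1);
    return pos;
}

/* Insert *pNew in front of the record *pCur, which lives at curPos
   and is known to have a predecessor. */
void AddMiddle(Person *pNew, Person *pCur, long curPos)
{
    Person before;
    long   newPos;

    pNew->Previous = pCur->Previous;    /* new record takes over the     */
    pNew->Next = curPos;                /* predecessor, current follows  */
    newPos = Append(pNew);

    GetRec(pCur->Previous, &before);    /* predecessor now points        */
    before.Next = newPos;               /* forward to the new record     */
    PutRec(pCur->Previous, &before);

    pCur->Previous = newPos;            /* current points back to it     */
    PutRec(curPos, pCur);
}
```

No existing record changes position; only two pointers in the neighbouring records are rewritten.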


The function Write() accepts a pointer to the object to be written, and returns the 
position of the new record. The idea is to go through the file records in 
physical sequence looking for the first record that has its Deleted flag set. 
If one is found, then the file position is backed up to that point, the new 
record is written to replace it, and the position where the record was 
written is returned. 


If no record in the file has been deleted then the new record is added to 
the end of the file. The last record is detected by checking the Next pointer 
of each record. The Next pointer in the last record will be -1. 
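

The search for a reusable slot can be sketched like this. Note one deliberate difference from the description: this version recognizes the end of the file by fread() failing at end of file, not by testing Next pointers. The cut-down Person structure and the helper Check() are assumptions for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {                    /* cut-down record, for illustration */
    char Surname[20];
    long Next;
    long Previous;
    int  Deleted;
} Person;

FILE *pFile;                        /* global stream for the data file */

void Check(int ok)                  /* end the program on an I/O failure */
{
    if (!ok) { perror("File error"); exit(EXIT_FAILURE); }
}

/* Write *pPerson over the first deleted record found in physical
   sequence, or at the end of the file if there is none.
   Returns the position at which the record was written. */
long Write(Person *pPerson)
{
    Person rec;
    long   pos = 0L;

    assert(pFile != NULL && pPerson != NULL);
    Check(fseek(pFile, 0L, SEEK_SET) == 0);
    while (fread(&rec, sizeof(Person), 1, pFile) == 1) {
        if (rec.Deleted)
            break;                       /* reuse this slot       */
        pos += (long)sizeof(Person);     /* step to the next slot */
    }
    Check(!ferror(pFile));               /* stopped by an error?  */

    /* pos is now either a deleted slot or the end of the file */
    Check(fseek(pFile, pos, SEEK_SET) == 0);
    Check(fwrite(pPerson, sizeof(Person), 1, pFile) == 1);
    return pos;
}
```

Reusing deleted slots keeps the file from growing every time a record is removed and another is added.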


Testing the Add Capability 


There is enough here to allow us to test the program. We can create the file 
and add records to it. There are functions in main() that we haven't yet 
written, but we can replace each of them with a dummy function. For 
example: 





We can include a version of each of the functions we haven't yet produced 
which will just display a message. It would be judicious to include some 
assert() macro calls in the code we are testing, as well as some 
diagnostics of our own. A good general diagnostic for untested code is for 
each function to display a message when it is called, which you can 
surround with #if-#endif, controlled by a #define as we have seen 
previously. You can then compile and execute the program. Don't forget the 
header files that are needed for standard library functions. You must also 
remember to include a prototype for each function. 
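

Dummy functions and a switchable diagnostic might be set up as in this sketch. The names DIAGNOSTICS and TRACE and the wording of the messages are assumptions; the three function names are the ones called from main().

```c
#include <assert.h>
#include <stdio.h>

#define DIAGNOSTICS 1       /* set to 0 to compile the trace output out */

#if DIAGNOSTICS
#define TRACE(name) fprintf(stderr, "\nEntering %s", (name))
#else
#define TRACE(name) ((void)0)
#endif

/* Dummy versions of the functions that haven't been written yet */
void DeletePerson(void)
{
    TRACE("DeletePerson()");
    printf("\nDelete is not available yet.");
}

void Search(void)
{
    TRACE("Search()");
    printf("\nSearch is not available yet.");
}

void ListFile(void)
{
    TRACE("ListFile()");
    printf("\nList is not available yet.");
}
```

As each real function is completed it replaces its dummy, while the TRACE() calls can stay in place and be switched off by changing a single #define.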


Deleting a Record 


The DeletePerson() function will delete a record from the file, and the 
process has a lot in common with the insert mechanism we've just looked 
at. The essential logic is very simple. We read the name for the record to be 
deleted, we search the file for the record, and if we find it, we delete it. If 
it isn't there then we display a message. The complications arise with the 
actual process of deleting a particular record. 


The code for the DeletePerson() function is: 


After opening the file for update, all the action takes place within an 
infinite for loop to allow for an arbitrary number of successive delete 
operations to be requested. A name is read using the GetName() function 
we saw earlier, and the FindEntry() function is called to search the file for 
the record containing the name. The FindEntry() function has the 
prototype: 





The first argument is the name to be found, and the record corresponding 
to the name will be stored in the structure pointed to by the second 
argument. The position in the file of the record found is returned from the 
function, or it will return -1 if no record containing the name passed as the 
first argument exists. 


If a record is found, then it is checked to see whether it's the first or last 
record in the file. The first record always has the Previous pointer set to 
-1, and the last record has its Next pointer set to -1. If it's neither of these 
then it must be in the middle - a separate function deals with each case. 


Once a search has been processed, or if a name wasn't found, then the user 
is prompted for another delete operation. If the response to this is negative 
then the function returns. 


Finding a Record 


The FindEntry() function that we used to find a record to be deleted, will 
also be used in the search operation to display a particular record. Finding 
a record corresponding to a particular name is quite straightforward. We 
read the file from the beginning in alphabetical sequence by following the 
Next pointers. As each record is read, the Name member is checked to see 
if it’s the one that we're looking for. If it is then we return its position. 





If the name we are looking for isn't in the file, then the process ends when 
a record is read with a name greater than the one sought, or we reach the 
last record in the file. The code to do this is as follows: 


After getting to the beginning of the file, records are read by following the 
Next pointers. The first check is for the end of file being read, which is to 
deal with the possibility that someone is attempting to search a file that 
exists, but has never had any records written to it. The next check that it is 
essential for us to make is for an initial record with the Deleted flag set. 
This is to cover for the possibility of a file having had all its members 
deleted, and if we don't check this then we could end up wandering 
around deleted records in a file. If the first record is valid, then we will 
only be looking at valid records since we sever the links to deleted records. 


We then start looking for a name match in the current record. This uses a 
macro to test for strings being equal, and can be defined as: 





As with the macro STRLT() which we saw earlier, it makes it a little more 
obvious what we are doing. 


If we find the name, we copy the Person object from the record to the 
caller's memory pointed to by the parameter pPerson, and return the file 
position. This is a very simplistic approach - it doesn't check the first name 
and doesn't allow for multiple file records with the same surname. This is 
an extension you might like to have a go at yourself. 


If we don't find an equal surname, then we check for the possibility that 
the surname we're looking for is less than the surname in the current file 
record. If it is, then there is no point in looking at the rest of the file since 
all subsequent records will be greater as the file is in ascending alphabetical 
sequence. We return a value of -1 to signal that the name wasn't found. 


If the current record name isn't greater than the name we're looking for, we 
pick up the file position of the next record from the Next pointer, and 
assuming that we're not at the last record, we cycle through the loop again. 
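

The whole search can be sketched as below. This is an illustration, not the original listing: the macro name STREQ is an assumption (the text doesn't name the equality macro), the parameter types are inferred from the description of the prototype, and the Person structure is cut down to the members the search needs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAMELEN 20                              /* assumed field size */
#define STREQ(a, b) (strcmp((a), (b)) == 0)     /* name assumed       */
#define STRLT(a, b) (strcmp((a), (b)) < 0)

typedef struct {
    char Surname[NAMELEN];
    char FirstName[NAMELEN];
} Name;

typedef struct {                    /* cut-down record, for illustration */
    Name aName;
    long Next;
    long Previous;
    int  Deleted;
} Person;

FILE *pFile;                        /* global stream for the data file */

/* Search for the surname in *pName by following the Next pointers.
   On success the record is copied to *pPerson and its file position
   is returned; otherwise -1 is returned. */
long FindEntry(Name *pName, Person *pPerson)
{
    Person rec;
    long   pos = 0L;                    /* first record is at the start */

    assert(pFile != NULL && pName != NULL && pPerson != NULL);
    while (pos >= 0L) {
        if (fseek(pFile, pos, SEEK_SET) != 0 ||
            fread(&rec, sizeof(Person), 1, pFile) != 1)
            return -1L;                 /* empty file or failed read    */
        if (pos == 0L && rec.Deleted)
            return -1L;                 /* every record was deleted     */
        if (STREQ(pName->Surname, rec.aName.Surname)) {
            *pPerson = rec;             /* hand the record back         */
            return pos;
        }
        if (STRLT(pName->Surname, rec.aName.Surname))
            return -1L;                 /* gone past where it would be  */
        pos = rec.Next;                 /* -1 after the last record     */
    }
    return -1L;
}
```

Unlike the program's own version, this sketch simply gives up on a failed read instead of going through Read() and its retry logic.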


Deleting the First Record 


There's a lot of work to delete the first record because there must always 
be one unless all the records have been deleted. The original record already 
has the deleted flag set in the function DeletePerson(), so here we have to 
fix up the other records that may be affected. The code to execute this is as 
follows: 


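/**************************************************************
 * Function to delete the first record in the file            *
 * The argument is a pointer to a Person object which         *
 * has just been deleted from the file. This function moves   *
 * the second record to the first position, if it exists.     *
 * It also modifies the Previous pointer of the third         *
 * record if it exists.                                       *
 **************************************************************/
void DeleteFirst(Person *pPerson)
{
   Person bPerson;
   long fPos=-1L;

   if (pPerson->Next<0L)
      return;                                /* No next one */

   /* But there is a next one */
   fseek(pFile, pPerson->Next, SEEK_SET);
   Read(&bPerson);                           /* Read next record */
   bPerson.Previous=-1L;                     /* Set previous to none */

   /* The remaining steps follow the description given below */
   fseek(pFile, 0L, SEEK_SET);               /* Go to first position */
   WriteFile(&bPerson);                      /* Second is now first */

   fPos=bPerson.Next;                        /* Note third position */
   bPerson.Deleted=1;                        /* Delete the old copy */
   fseek(pFile, pPerson->Next, SEEK_SET);    /* Go to where it was */
   WriteFile(&bPerson);                      /* and write it back */

   if (fPos<0L)
      return;                                /* No third record */

   fseek(pFile, fPos, SEEK_SET);             /* Go to third record */
   Read(&bPerson);                           /* and read it */
   bPerson.Previous=0L;                      /* Previous is now first */
   fseek(pFile, fPos, SEEK_SET);             /* Reposition */
   WriteFile(&bPerson);                      /* and write it back */
   return;
}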


If the first record is the only one in the file, then it's easy. We just set the 
deleted flag and write the record back. If there's more than one record in 
the file, then we need to read the second record, and copy it to the position 
occupied by the first after setting its Previous member to -1. Following 
this, we need to delete the original copy of the second record by setting its 
deleted flag, and writing it back to where it was. We then need to see if 
there was a third record. If there was, its Previous pointer contains the wrong 
file position for what was the second record and has now been promoted to 
first. So we read the third record, set its Previous pointer to 0, the position for 
the first record, and write it back in place. 


Deleting from the Middle 


The record being deleted has already been fixed by the calling function 
DeletePerson(), so again we're going to fix other records that are affected. 
It is quite clear in this case. We need to modify the records either side of 
the one we have deleted. The code to do this is as follows: 


   Person bPerson;
   long fPos=-1L;

   /* Fix the Next pointer for previous record */
   fseek(pFile, pPerson->Previous, SEEK_SET);   /* Go to the previous */
   fPos=Read(&bPerson);                         /* and read it */
   bPerson.Next=pPerson->Next;                  /* Bypass the deleted */
   fseek(pFile, fPos, SEEK_SET);                /* Reposition */
   WriteFile(&bPerson);                         /* and write it back */

   /* Fix the previous pointer for the following record */
   fseek(pFile, pPerson->Next, SEEK_SET);
   fPos=Read(&bPerson);                         /* Read next record */
   bPerson.Previous=pPerson->Previous;          /* Bypass deleted */
   fseek(pFile, pPerson->Next, SEEK_SET);       /* Reposition */
   WriteFile(&bPerson);                         /* and write it back */

   return;                                      /* We are done */


This works out quite easily. We read the preceding record and fix the Next 
pointer to point to the following record. We then read the following record 
and change the Previous pointer to point to the preceding record - and 
that's it. 


Deleting the Last Record 


This is the simplest case of all. The only record to be affected is the second 
to last, which now becomes the last. It must exist if this function is called, 
because if it didn't, we would be deleting the first which is already taken 
care of. The code for this function is: 


/***************************************************************
 * Function to delete the last record in the file              *
 * The argument is the file position of the record preceding   *
 * the last record which has been deleted so that it now       *
 * becomes the last.                                           *
 ***************************************************************/
void DeleteLast(long fPos)
{
   Person aPerson;
   fseek(pFile, fPos, SEEK_SET);     /* Position to 2nd to last */
   Read(&aPerson);                   /* and read it */
   aPerson.Next=-1L;                 /* There is now no next */
   fseek(pFile, fPos, SEEK_SET);     /* Reposition */
   WriteFile(&aPerson);              /* and write it back */
   return;
}


This just reads the second to last record, sets the Next pointer to -1 so that 
it becomes the last record, and writes it back. 


Searching for a Record 


We have done most of the work to search for a record with the previous 
function, FindEntry(). All we need in addition to this is the ability to 
display a record once we have found it. The code for the function Search() 
will be: 








   Display(&aPerson);                     /* Found, so show details */

   printf("\nDo you want to search for another (y or n)? ");
   scanf("%1s", ch);

   if((*ch=='n')||(*ch=='N'))             /* Check response for negative */
   {                                      /* No more searches needed */
      fclose(pFile);                      /* So close the file */
      return;                             /* and return to caller */
   }


After opening the file and making sure that we have a valid name to search 
for, the search process itself is very simple. We call the function 
FindEntry() to get hold of the record corresponding to the name entered, 
and we call the function Display() to display it. The operation takes place 
in a for loop to permit several successive searches to be made without 
necessitating going back to the set of choices in main(). 


Displaying a Record 


The function to display a record just writes the personal information 
contained in the Person object to stdout: 


/***********************************************
 * Function to display a Person record         *
 * Argument is a pointer to the Person.        *
 * There is no return value.                   *
 ***********************************************/
void Display(Person *pPerson)
{
   int i=0;                                   /* Loop counter */
   printf("\n\nName: \t%s %s\nPhone: \t%s\nAddress:",
      pPerson->aName.Surname, pPerson->aName.FirstName, pPerson->Phone);
   for(i=0; i<ADDRLINES; i++)
   {
      if(strlen(pPerson->Address[i])==0)      /* Check for empty line */
         return;                              /* If so we are done */
      printf("\n\t%s", pPerson->Address[i]);
   }
   return;         /* After ADDRLINES output we are done anyway */
}


The address is displayed by the for loop, which ends the function when 
the first empty address line is found. If the address contains the full set of 
ADDRLINES lines, the return following the loop is executed. 


Listing the File 


The last operation we support in the program is that of listing the entire 
contents of the file. This again is a simple process. We need to read the file 
from the beginning, displaying each record as we go. The code to do this is 
as follows: 





There's nothing new here. We read the file in the for loop to accommodate 
an arbitrary number of records. We then check that the first record isn't 
deleted. Each record is displayed using the function Display(), and the file 
position for the next record is obtained from the Next pointer of the current 
record. The process stops when we find a Next value of -1, indicating that 
there is no next record. 
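

The traversal can be sketched as follows. To keep the example short it departs from the description in two ways that should be read as assumptions: the hypothetical function ListRecords() prints only the name instead of calling Display(), and it returns the number of records listed so that the result can be checked. The Person structure is cut down accordingly.

```c
#include <assert.h>
#include <stdio.h>

#define NAMELEN 20                  /* assumed field size */

typedef struct {
    char Surname[NAMELEN];
    char FirstName[NAMELEN];
} Name;

typedef struct {                    /* cut-down record, for illustration */
    Name aName;
    long Next;
    long Previous;
    int  Deleted;
} Person;

FILE *pFile;                        /* global stream for the data file */

/* List the records in sequence by following the Next pointers.
   Returns the number of records listed. */
int ListRecords(void)
{
    Person rec;
    long   pos = 0L;                    /* first record is at the start */
    int    count = 0;

    assert(pFile != NULL);
    while (pos >= 0L) {
        if (fseek(pFile, pos, SEEK_SET) != 0 ||
            fread(&rec, sizeof(Person), 1, pFile) != 1)
            break;                      /* empty file or failed read    */
        if (pos == 0L && rec.Deleted)
            break;                      /* every record was deleted     */
        printf("\n%s %s", rec.aName.Surname, rec.aName.FirstName);
        count++;
        pos = rec.Next;                 /* -1 after the last record     */
    }
    return count;
}
```

Because the loop follows the links and not the physical order of the file, the records come out in alphabetical sequence wherever they happen to be stored.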


The Program Header File 


We have discussed all the program code, so now we should come back to 
how it should be divided up between files. There's always a good case for 
a program header file, and sometimes several are appropriate if the program 
is large, or if it contains groups of functions which may not be needed by 
all the source files. Header files shouldn't contain anything that generates 
object code, but should accommodate all definitions required by the 
program. With our program one header file should suffice, and is listed in 
Appendix D. 


The include file contains all the definitions specific to the program as well 
as the include files for the standard library functions that are used. Of 
course, this file would need to be included at the beginning of all the 
source files making up the program. 


In case you have trouble gathering together all the bits that make up the 
program, a complete listing is provided in Appendix D. This listing is in 
three files, two source files, and a header file. It should compile and run as 
it is if you manage to type it all in correctly. 


Summary 


This wasn't a particularly complicated application, but nevertheless it led to 
a requirement for nineteen functions. This should have provided you with a 
little insight into how managing the source code for an application is a very 
important part of program development. If you have worked through the 
code step by step, you may well have introduced a few typos which will 
also demonstrate the value of diagnostic code in your program. 


The program is by no means fully developed. There is a lot of scope here 
for you to extend it, and to improve the code we have gone through. The 
whole process of dealing with names is very primitive, both in terms of 
what you might realistically want to allow as a valid name, and in terms of 
the comparison mechanism. There are quite a few holes in the way 
interactions with the user are handled. In particular, only testing for ‘n’ or 
‘N’, and assuming that ‘y’ is otherwise entered isn't a very secure basis 
for input. You could also experiment with other strategies for storing the 
data, and searching the file. With a lot of records, a sequential 
search mechanism is going to be rather slow. 


If you've worked diligently through all the examples in the book, you have 
a good knowledge of C and should be reasonably competent at applying it. 
All you need to polish your skills is practice, and the more the better. Enjoy 
your programming. 


Exercises 


1 Try to develop this personal address book system into a much more 
sophisticated application by implementing some of the following 
additions: 


Extend the search mechanism to allow a first name to be given as a 
search criterion in addition to the surname. 


Extend the search mechanism to allow any field to be used as a 
search criterion. 


Extend the search mechanism to retrieve multiple records for a given 
search criterion. 


Extend the list operation to provide a partial list capability, such as 
‘list the entries with surnames beginning with B or D’. 


Extend the Person structure to include additional fields, such as age 
or date of birth. Modify the search and list operations to incorporate 
any additional fields. 


Add a function to find, display and edit a record. 


Extend the delete function to enable the deletion of multiple records 
at once, based on single or multiple criteria. 











2 Create an application that will accept a text string from the keyboard 
and output an answer to simulate the response of a human trapped 
inside your computer. You will need a variety of responses to 
anticipate what a user will ask. The output should simply be 
conversation based, like this (with user input in italics): 


Hello, my name's Bob and I’m stuck inside your computer. 
Why? 

Because my parents named me after Robert Hope. 

Ha ha. 

You like my jokes - hey we'll get along just fine. 

What's your name? 


This type of dialogue could continue indefinitely, or at least until the user 
types 'Goodbye', 'Get lost' or something similar. 


Of course having responses ready for everything a user can type isn't very 
practical, but if you guide the user by asking questions and changing the 
subject then you can simulate a conversation with another human being. 
And remember, they don't have to be trapped inside your computer, you 
could produce responses for different scenarios. 


Prizes for the Best Conversation Simulator 


If you feel particularly happy with your conversation simulator and you feel that 
you would like to share it with us, then please send it in. We always welcome 
feedback, especially productive exercises from our book that you've slaved over. 
There are free books on offer for the best three we receive. 


Where Do We Go From Here? 


You may or may not have had opinions on or ideas about the C language before 
you read this book, but we are sure that you've found this tour an enlightening, 
informative and eye-opening experience. Not only have we tried to remove the 
excess stuffy baggage that traditionally drags programming guides down, but 
we've also attempted to pack everything you need to know into a compact, cost- 
effective reference guide. 


There is no doubting the popularity of the C language - it has consistently been 
at the forefront of the programming revolution for the last 20 years. The vast 
back catalog of existing software and the number of new spin-off languages show 
that your choice in learning C was a very wise one. Why don't you try some of 
our other titles, such as Revolutionary OOP Using C++, a natural progression from 
Instant C, or perhaps Revolutionary Visual C++ if you would like to delve into 
Windows programming? Whatever path you choose to take, we are sure that this 
introduction to C has proved worthwhile. 


Now that you’ve had a taste of our refreshing style, would you like to know 
more about Wrox Press and our other publications? If you do then why don’t you 
ask for our latest catalog, or check out our Web page. And remember when 
you’re down at your local bookstore, look out for our distinctive red binding - 
your guarantee of Wrox value. 


Are you interested in writing or reviewing any of our future books? We warmly 
welcome any willing contributors that can help Wrox to publish even better books. 
If you’re interested then contact us right away - see the details at the back of this 
book. 


You can contact Wrox via the reply card also at the back of this book, or you can 
correspond with us by any of the following means: 


Snail mail Wrox Press Ltd, Unit 16, 20 James Road, 
Birmingham, B11 2BA, United Kingdom. 

Electronic mail (e-mail) johnf@wrox.demon.co.uk 

World Wide Web http://www.wrox.com 

CompuServe 100063,2152 

Telephone (44121) 706 6826 


Facsimile (44121) 706 2967 








Appendix 











Formatted Input/Output Summary 





Formatted Input 


Reading data from the standard input stream stdin, usually the keyboard, is 
provided by the standard library function scanf(), which has the prototype: 


int scanf(const char *pFormat,...); 


The first parameter is a format string determining how data is to be read, 
and the subsequent arguments are pointers to variables which are to receive 
the input values. It is a common error to accidentally specify an argument 
that isn't a pointer, which usually leads to a program crash, since whatever 
is passed as an argument will be interpreted as a pointer. The integer value 
returned is the number of values read and stored, or EOF if the input ran out 
or an error occurred before anything could be read. 


Conversion Specifiers 


Conversion specifiers determine the way in which input data is interpreted. 
Each pointer argument must correspond to a conversion specifier. The 
format specifiers must also be consistent with the type of variable being 
used to store the data. The function scanf() has no way to verify that this 
is the case, or that the number of format specifiers is equal to the number 
of pointer arguments. The conversion specifiers are all of the form: 


%[*][width][h, or l, or L]conversion_character 


The optional components of the conversion specifier are enclosed within 
square brackets to emphasise that you must always have the % character 
and a conversion character. If the * is present, then this indicates that 
the input data isn't to be stored in a variable, but should be skipped and 
the next input value read. 


The width, if present, is an integer specifying the maximum number of 
characters in the input field. 


The h or l specifies that an integer value is to be converted as short or 
long respectively. The L applies to the conversion of floating point values to 
type long double. 


Conversion Characters 



Possible conversion character values and their corresponding meanings 
are shown here: 


For Reading Integer Values: 


d           Decimal value of type int. 

i           A value of type int that may be decimal, octal (with a 
            leading 0), or hexadecimal (with a leading 0x or 0X). 

u           A decimal value of type unsigned int. 

x           A hexadecimal value to be stored as type int (the 0x or 
            0X can be omitted). 

o           An octal value to be stored as type int (the 0 can be 
            omitted). 

n           Stores the count of the number of input characters read 
            up to this point as a value of type int. 


For Reading Floating Point Values: 


e, f, or g  A value of type float where a leading sign, a decimal 
            point, and an exponent are all optional. The exponent may 
            be written with a leading e or E, or a sign, or both. 


For Reading Characters: 


c               The characters specified by the width field, including 
                whitespace characters, are stored as type char with no 
                terminating ‘\0’. 

s               A string of non-whitespace characters stored as type char 
                - %1s will read the first character that isn't whitespace. 

%               Specifies a % sign. Nothing is stored. 

[search_set]    Successive characters are stored as a string of type char, 
                as long as they belong to the characters specified by 
                search_set. For example, %[abc] will read a string 
                consisting of only a, b, or c. The first character not in 
                search_set stops the process. 

[^search_set]   Successive characters are stored in a string of type 
                char as long as they are not included in search_set. 


For Reading Pointer Values: 


p A pointer value is stored as type void *. The form of 
the input is implementation dependent. 


The scanf() Function 


Blanks or tabs in the format string match any amount of whitespace in the 
input, including none. A sequence of characters other than % included in 
the format string, and not part of a format specification, indicates that 
the input should be matched to the specified characters. 


A generalized version of the scanf() function is available in the standard 
library, which has the prototype: 


int fscanf(FILE *pFile, const char *pFormat,...); 


The first argument is a pointer to a file stream. 


The function fscanf() operates in the same way with the same format 
specifiers as the scanf() function, and the scanf() function is equivalent to 
fscanf() applied to stdin. 


Formatted Input from Memory 


The standard library also provides a function to convert data stored in 
memory. Its prototype is: 


int sscanf(const char *pStr, const char *pFormat,...); 


This also operates identically to scanf() except that the input data is 
obtained from the string pointed to by pStr. 


Formatted Output 


The standard library function for general formatted output has the 
prototype: 


int fprintf(FILE *pFile, const char *pFormat,...); 


This will write formatted data to the file stream defined by the pointer 
pFile. The formatting of the output is controlled by the format string 
pointed to by pFormat. A variable number of arguments can follow the 
format string argument, and they are matched in sequence with the format 
specifiers appearing in the format string. The number and type of the 
variables to be written must correspond with the format specifiers appearing 
in the format string. 


The function printf() is equivalent to the function fprintf() applied to 
stdout. 


The general format of a format specifier for output is: 


%[flags][width][.precision][h, or l, or L]conversion_character 


The elements shown between square brackets here are optional. 

Conversion Characters 


The conversion character specifies the type of the value to be output and 
how it is to be converted. The possible conversion characters are shown 
here: 


For Outputting Integer Values: 


d or i      Output of a number of type int as a signed decimal 
            value. 

o           Output of a number of type int as an unsigned octal 
            value. 

x or X      Output of a number of type int as an unsigned 
            hexadecimal value. If x is used then a through f are 
            used as digits, and if X is used then A through F are 
            used as digits. 

u           Output of a number of type int as an unsigned 
            decimal value. 

n           The number of characters output up to this point is stored 
            in the corresponding argument, which must be of type int *. 


For Outputting Floating Point Values: 


e or E      Output of a number of type float as a signed floating 
            point value with an exponent. If e is used then the 
            exponent is preceded by e, and if E is used then the 
            exponent is preceded by E. 

f           Output of a number of type float as a decimal value 
            without an exponent. 

g or G      Output of a number of type float in e or f form 
            depending on the value. If G is specified and an 
            exponent is necessary then it will be in E form. 


For Outputting Characters: 


c           Output of a single character of type int after conversion 
            to type unsigned char. 

s           Output of a sequence of characters specified by an 
            argument of type char *. Output stops when a ‘\0’ 
            character is found, or when the number of characters 
            specified by precision has been output. 

%           Outputs a % sign - no corresponding variable is 
            required. 


For Outputting a Pointer: 


p Outputs a pointer of type void * in an implementation 
dependent form. 


Flags 


The flags in the format specifier are optional, but if they are present, they 
affect how the output is presented. If more than one flag is specified then 
they can be in any order. They are as follows: 


+           Causes the output to be presented with a leading + or - 
            sign. 

-           Causes the output to be presented left justified in its 
            field. Right justified output is the default. 

space       Causes the output to be presented with a leading space 
            if there's no leading sign. 

0           Causes the output for integer values to be presented 
            with leading zeros. 

#           The effect of this flag depends on the conversion 
            character. For o conversion the value will be preceded 
            by 0. For x or X conversion the output will be preceded 
            by 0x or 0X. For floating point conversions the output 
            will always contain a decimal point. By default, no 
            decimal point appears if the digits following it are zero. 


Width Modifier 


The width element in the conversion specifier is optional, and determines a 
minimum field width for output. It can be specified either as an integer 
value defining the minimum number of character positions, or it can be an 
asterisk, specifying that the next argument in the function call represents a 
field width value. The value must be of type int. 


If the output requires more characters than that defined by a width 
modifier, then the field is expanded to accommodate the output value. If 
the width is specified with a leading zero, then the output is padded with 
leading zeros to the left of the number. If a - flag is also used, the number 
is left justified and blanks are used instead. 


Precision Specification 


If a precision specification is present, it always begins with a period which 
acts as a separator between the width and precision specification. As in the 
case of a width value, it can be an integer value determining the number of 
digits of precision, or it can be specified as * indicating that the next 
argument in the function call specifies a value for the precision. The 
argument must be of type int. 


If the output needs more characters than specified by the precision, it may 
be truncated or rounded to fit the number of positions specified. 


Size Modifier 


The optional size modifiers h, or l, or L, affect how the types of the 
arguments are interpreted. 


The modifier h only applies to conversion characters d, i, o, x, X, or u, and 
specifies that the argument is of type short int. 


The modifier l can be applied to conversion characters d, i, o, x, X, or u, 
and when present specifies that the argument is of type long, or it can be 
applied to conversion characters of type e, E, f, g, or G, where it specifies 
that the argument is of type double. The modifier L only applies to 
conversion characters of type e, E, f, g, or G, where it specifies that the 
argument is of type long double. 


Formatted Output to Memory 


Analogous to the sscanf() function, the standard library provides a 
function to output formatted data to a user defined buffer in memory. Its 
prototype is: 


int sprintf(char *pStr, const char *pFormat,...); 


Apart from the fact that output is to the memory area pointed to by the 
first parameter, pStr, this function operates identically to the fprintf() 
function. The output generated is terminated by ‘\0’. The count of output 
characters returned doesn't include the terminating ‘\0’. 


The ASCII Table 








The American Standard Code for Information Interchange or ASCII assigns 
values between 0 and 255 for upper and lower case letters, numeric digits, 
punctuation marks and other symbols. ASCII characters can be split into the 
following sections: 


0 - 31      Control functions 

32 - 127    Standard, implementation-independent characters 

128 - 255   Special symbols, international character sets - generally, 
            non-standard characters. 


Since the latter 128 characters are implementation-dependent and have no 
fixed entry in the ASCII table, we shall only cover the first two groups in 
the following table: 


ASCII Characters 0 - 31 


Decimal   Hexadecimal   Control 


0         00            NUL 
1         01            SOH 
2         02            STX 
3         03            ETX 
4         04            EOT 
5         05            ENQ 
6         06            ACK 
7         07            BEL (Audible bell) 
8         08            BS (Backspace) 
9         09            HT 
10        0A            LF (Line feed) 
11        0B            VT (Vertical tab) 
12        0C            FF (Form feed) 
13        0D            CR (Carriage return) 
14        0E            SO 
15        0F            SI 
16        10            DLE 
17        11            DC1 
18        12            DC2 
19        13            DC3 
20        14            DC4 
21        15            NAK 
22        16            SYN 
23        17            ETB 
24        18            CAN 
25        19            EM 
26        1A            SUB 
27        1B            ESC (Escape) 
28        1C            FS 
29        1D            GS 
30        1E            RS 
31        1F            US 
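

ASCII Characters 32 - 127 


Decimal   Hexadecimal   Character   Decimal   Hexadecimal   Character 


32        20            (space)     80        50            P 
33        21            !           81        51            Q 
34        22            "           82        52            R 
35        23            #           83        53            S 
36        24            $           84        54            T 
37        25            %           85        55            U 
38        26            &           86        56            V 
39        27            '           87        57            W 
40        28            (           88        58            X 
41        29            )           89        59            Y 
42        2A            *           90        5A            Z 
43        2B            +           91        5B            [ 
44        2C            ,           92        5C            \ 
45        2D            -           93        5D            ] 
46        2E            .           94        5E            ^ 
47        2F            /           95        5F            _ 
48        30            0           96        60            ` 
49        31            1           97        61            a 
50        32            2           98        62            b 
51        33            3           99        63            c 
52        34            4           100       64            d 
53        35            5           101       65            e 
54        36            6           102       66            f 
55        37            7           103       67            g 
56        38            8           104       68            h 
57        39            9           105       69            i 
58        3A            :           106       6A            j 
59        3B            ;           107       6B            k 
60        3C            <           108       6C            l 
61        3D            =           109       6D            m 
62        3E            >           110       6E            n 
63        3F            ?           111       6F            o 
64        40            @           112       70            p 
65        41            A           113       71            q 
66        42            B           114       72            r 
67        43            C           115       73            s 
68        44            D           116       74            t 
69        45            E           117       75            u 
70        46            F           118       76            v 
71        47            G           119       77            w 
72        48            H           120       78            x 
73        49            I           121       79            y 
74        4A            J           122       7A            z 
75        4B            K           123       7B            { 
76        4C            L           124       7C            | 
77        4D            M           125       7D            } 
78        4E            N           126       7E            ~ 
79        4F            O           127       7F            DEL 

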

ASCII Characters 128 - 255 


The ASCII characters between 128 and 255 are system-dependent, so we 
can't print them here. What we can do though is give you a program 
which will print out all the codes between 32 and 255 on your machine: 


Keywords in C 





Keywords are reserved words that you can't use as identifiers. ANSI 
standard C has 32 keywords defined: 


auto int 
break long 
case register 
char return 
const short 
continue signed 
default sizeof 
do static 
double struct 
else switch 
enum typedef 
extern union 
float unsigned 
for void 
goto volatile 
if while 


Case is significant in these keywords. For example, 
the identifiers WHILE, EXTernal and unSIGNed will not be recognized as 
keywords. 


The Address Book Source Code 


In Chapter 11 we developed the address book program consisting of three 
files: 


The header file MYHEADER.H 

The source file EX11-01.C 

The source file FNCTNS.C 


MYHEADER.H 


EX11-01.C 


        Insert(pPerson);            /* File is open so insert person */ 
      } 
    } 
    else 
    {                               /* Flag is set so the file exists */ 
      pFile=fopen(pFPersons,"rb+"); /* Open it for update */ 
      if(pFile==NULL) 
      {                             /* Open failed so drop out */ 
        printf("\nUnable to open Persons file. Program ended."); 
        exit(1); 
      } 
      Insert(pPerson);              /* Insert the new Person object */ 
    } 
    free(pPerson);                  /* Write is done so release memory */ 
    fclose(pFile);                  /* and close the file */ 

    printf("\nDo you want to add another person(y or n)? "); 
    scanf("%1s",ch); 

    if(*ch=='n'||*ch=='N')          /* Want to add another? */ 
      return;                       /* No, so return */ 


/*********************************************
 * Function to delete a person from the file *
 *********************************************/ 

void DeletePerson(void) 
{ 
  Person aPerson;                   /* Record to be deleted */ 
  Name aName; 
  long fPos=-1L; 
  char *ch="n"; 

  pFile=fopen(pFPersons,"rb+");     /* Open file for update */ 
  if(pFile==NULL) 
  { 
    printf("\nUnable to open Persons file. Program ended."); 
    exit(1); 
  } 
  for(;;) 
  { 
    if(!GetName(&aName)) 
    { 
      printf("\nInvalid name entered. Delete aborted."); 
      fclose(pFile);                /* So close the file */ 
      return; 
    } 
    if((fPos=FindEntry(&aName,&aPerson))==-1L) 
    { 


Index 


Symbols 


. 186 
-- 43, 44 
% 43 
* 43 
+ 43 
++ 43, 44 
- 43 
/ 43, 44 
? 77 
! 76 
!= 68 
#define 105, 310, 314 
#elif 328 
#else 327 
#endif 323 
#error 327 
#if 323 
#ifdef 325 
#ifndef 324 
#include 234, 310 
& 117 
&& 75 
< 68 
<= 68 
== 68 
> 68 
>= 68 
|| 76 


A 


acos() function 254 
AddPerson() function 372 
address operator 
pointers 117 
AddTen() function 155 
AND 
(&&) logical operator 75 
bitwise operator 51 
ANSI identifiers. See naming conventions 
argument 13 
array as argument to function 156 
command-line 172 
pointer to a function as 170 
argument list 
printing text and variables 41 
arithmetic expressions 43 
associativity 44 
constant 47 
exercise 46 
floating point 48 
operators 
-- 43, 44 
% 43 
* 43 
+ 43 
++ 43, 44 
- 43 
/ 43, 44 
syntax 50 
parentheses 45 
precedence 43 
arrays 100 
as function arguments 156 
character 105 
declaring 100 
elements of 101 
example 103 
flexibility 
sizeof 115 
incomplete rows 112 
index value 101 
initializing 103 
memory storage 102, 111 
multi-dimensional 110 
initialization 112 
naming 125 
of structures 196 
one-dimensional 110 
pointer arithmetic 123 
pointers 120 
ASCII 34. See also Appendix B 


asin() function 254 
assert() function 331 
ASSERT.H 234 
assertions 
debugging with assert macro 331 
assignment statements 32, 45 
casting 56 
multiple 45 
op- 49 
arithmetic expressions 44 
operator precedence 59 
atan() function 254 
atan2() function 254 
atoi() function 251 
atol() function 251 
automatic variables 59 
local/block scope 59 


B 


binary mode 281 
example file 283 
checking for a divisor 287 
collating the parts 289 
outputting primes 288 
validating a prime 285 
specifying 281 
binary operators 
precedence 59 
binary trees 219 
bitwise operators 
AND 51 
masking 52 
exclusive OR 53 
NOT 53 
OR 52 





block scope. See automatic variables 
blocks 11 

Boolean types 39 

break statement 79 

buffer 266 

Buffer() function 286 


C 
C 


characteristics of 8 
compiling a program 17 
developing a program 365 
executing a program 17 
maintainability 357 
operating system effects 23 
portability 343 
program decision making 68 
structure of a program 10 
C++ 9 
calloc() function 132, 136 
cascading errors 21 
case 
switch statement 81 
casting 
assignment statements 56 
explicit 57 
syntax 57 
variables and constants 54 
conversion rules 54 
ceil() function 255 
character arrays 
storing multiple strings 113 
string handling 105 
diagram 106 
string input 107 
string reading 107 





character classification functions 235 
CTYPE 235 
character variables 
char 34 
declaring 34 
example 35 
Check() function 286 
CheckName() function 379 
clearerr() function 304 
clock() function 236 
comments 14 
good practice 15 
compiling a program 17 
diagram 17 
editing 18 
text files 18 
error generation 20 
environment help 22 
revision 21 
linking 17 
text editors 19 
conditional operator 
?: 77 
example 77 
syntax 77 
constants 
enumeration 40 
named 38 
continue statement 89 
example 89 
flow diagram 90 
cos() function 254 
cosh() function 254 
CPU_timer() function 241 
CreateList() function 208 
CreateTree() function 220 
CTYPE.H 234 
int 235 



D 


data structures 181 
member access through pointers 197 
and functions 198 
creating structures 198 
managing memory for dynamic 
structures 202 
arrays of 196 
as function arguments 188 
as return values 189 
data organisation 204 
linked lists 204 
declaring 182 
variables 183 
variables and structures together 183 
implementing 185 
initializing 184 
referring to members 186 
structure tag 183 
tree 217 
typedef 184 
unions 226 
using pointers with 197 
data types 30 
defining your own names 
typedef 63 
floating point 48 
integers 30 
declaration statements 30 
long 30 
short 31 
date and time functions 
processor time 236 
debugging 330 
the assert macro 331 
pointer problems 332 
removing diagnostic statements 332 
declaration statements 30 
decrement operators 
arithmetic expressions 44 
defining declaration 34 
Delete() function 202 
DeleteList() function 216 
DeletePerson() function 391 
DeleteTree() function 226 
delimiters 
analyzing a string 248 
dereferencing 
pointers 117, 124 
developing a program 
defining the problem 366 
general logic 370 
header file 401 
managing the application data 367 
name structure 367 
person structure 368 
structuring 366 
the person file 369 
adding a person 372 
adding a record 
to the beginning 386 
to the end 389 
to the middle 387 
creating a person 377 
deleting a record 391 
from the beginning 395 
from the end 397 
from the middle 396 
displaying a record 399 
finding a record 393 
inserting a person 380 
listing the file 400 
reading a record 385 
searching for a record 398 
testing the add capability 390 
writing a record 379 
difftime() function 239 
directives 
#define 
disregarding context 315 
macro substitutions 315 
nesting substitutions 314 
Display() function 293, 399 
DisplayList() function 210 
DisplayTree() function 224 
divide operator 44 
do-while loop 94 
example 95 
flow diagram 95 
syntax 95 
domain error 
mathematical functions 255 
double variables 37 
doubly linked lists 211 
dynamic memory allocation 
extending memory areas 138 
pointers 132 
the heap 132 
calloc() 132 
free() 132 
malloc() 132 
realloc() 132 


E 
editing 18 


errors 18 

text editors 19 
editing environments 19 
efficiency 9 
ElapsedTimer() function 241 
end of file 266 
enumeration 39 





assigning specific values 40 
constants 40 
defining Boolean values 40 
EOF. See end of file 
EOR. See exclusive OR 
ERRNO.H 234 
domain error 255 
error generation 20 
cascading errors 21 
revision 21 
escape sequences 
printing text and variables 42 
exclusive OR 
bitwise operator 53 
executing a program 
compiling 17 
linking 22 
run option 23 
exit() function 331 
exp() function 255 
explicit casting 57 
syntax 57 
exponent 36 
external variables 176 


F 


fabs() function 255 
factorial() function 161 
False 

decision making 69 
feof() function 302 
ferror() function 302 
fflush() function 280 
fgetpos() function 292 
file mode 

mode specifiers 264 
file operations 261 


buffering 266 
diagram 266 
closing 269 
concept of a file 262 
diagram 262 
creating a unique filename 301 
file error functions 302 
clearerr 304 
error numbers 303 
feof() 302 
ferror() 302 
printing error message 303 
input/output 
flushing an output buffer 280 
formatted 275 
unformatted 281 
modes 279 
moving around in a file 289 
file positioning operations 290 
finding out where you are 290 
setting a position 291 
opening 263 
fopen() 263 
pushing a character back 268 
reading a string from a file 273 
reading characters from a file 
fgetc() 267 
reading from a binary file 282 
specifying binary mode 281 
temporary work files 300 - 301 
creating 300 
unique file creation example 301 
updating 280 
writing a binary file 281 
writing a string to a file 272 
writing characters to a file 266 


end of file 266 
fputc() 266 
file scope 
global variables 61 
FindEntry() function 392 
flexibility 9 
FLOAT.H 234 
floating point 
types 49 
variables 36 - 37 
exponent 36 
mantissa 36 
with loops 91 
floor() function 255 
fmodf() function 255 
fopen() function 263 
for loop 84 
infinite 86 
syntax 84 
format specifiers 
length modifiers 42 
printing text and variables 41 
time and date functions 237 
format string 
printing text and variables 41 
fprintf() function 276 
fputc() function 266 
fputs() function 272 
fread() function 282, 286 
free() function 132 
frexp() 255 
fscanf() function 279 
fseek() function 291 
fsetpos() function 291 
ftell() function 290 
function 
body 12,147 
call 144 
definition 144 

execution 12 

header 144, 146 

name 144 

parameter naming 149 

pass-by-value 153 

passing arguments to 151, 153 

passing multi-dimensional arrays to 159 

pointer as argument to 154 

pointers to 167 - 170 

private 176 

prototype 149 

recursive 164 

returning values from 159 

static variables in 163 
function() function 175 
functions 12 

acos() 254 

AddPerson() 372 

AddTen() 155 

asin() 254 

assert() 331 

atan() 254 

atan2() 254 

atoi() 251 

atol() 251 

Buffer() 286 

calloc() 132 

ceil() 255 

CheckName() 379 

clearerr() 304 

clock() 236 

cos() 254 

cosh() 254 

CPU_timer() 241 

CreateList() 208 

CreateTree() 220 

Delete() 202 





DeleteList() 216 
DeletePerson() 391 
DeleteTree() 226 
difftime() 239 
Display() 293, 399 
DisplayList() 210 
DisplayTree() 224 
ElapsedTimer() 241 
exit() 331 

exp() 255 

fabs() 255 
factorial() 161 
feof() 302 
ferror() 302 
fflush() 280 
fgetpos() 292 
FindEntry() 392 
floor() 255 
fmodf() 255 
fopen() 263 
fprintf() 276 
fputc() 266 
fputs() 272 
fread() 282, 286 
free() 132 

frexp() 255 
fscanf() 279 
fseek() 291 
fsetpos() 291 
ftell() 290 
function() 175 
fwrite() 281 
getchar() 109 
GetPhone() 205 
GetPoint() 195 
gets() 107 
Insert() 293 
InsertNode() 220 
Intersection() 195 
isalnum() 235 
isalpha() 235, 379 
iscntrl() 235 
isdigit() 235 
islower() 235 
isprint() 235 
ispunct() 235 
isspace() 235 
isupper() 235 
ldexp() 255 
localtime() 236 
log() 255 

main() 13 
malloc() 132 
memcpy() 250 
memmove() 250 
memset() 249 
myfun() 177 
Parallel() 203 
perror() 303 
pow() 255 
power() 149 
printf() 41. See also Appendix A 
PutBuffer() 285 
putc() 266 

puts() 272 
ReadPeople() 293 
ReadPerson() 372 
realloc() 132 
remove() 302 
rewind() 304 
scanf() 48. See also Appendix A 
Search() 398 
ShowNumber() 222 
sin() 254 

sinh() 254 
sprintf() 254 


sqrt() 255 
strerror() 303 
strcat() 242 
strcmp() 242 
strcpy() 242 
strcspn() 246 
strlen() 242 
strncat() 243 
strpbrk() 247 
strspn() 245 
strtod() 250, 253 
strtok() 248 
strtol() 250, 252 
sum() 168 
tan() 254 
tanh() 254 
TestPrime() 286 
time() 236 
tmpnam() 301 
toupper() 235 
treble() 12 
ungetc() 268 
WriteFile() 374 
fwrite() function 281 


G 


garbage value 34 
generic sizes 
sizeof 115 
getchar() function 109 
GetLine() function 195 
GetPhone() function 205 
GetPoint() function 195 
gets() function 107 
global variables 61 
goto statement 82 
statement label 82 


H 


header file 17. See also library files 
heap 

calloc() 132 

free() 132 

malloc() 132 

realloc() 132 
Hungarian notation. See naming 

conventions 


I 


identifier. See naming conventions 
if statement 70 

diagram 70 

nesting 72 
if-else statement 72 

nesting 72 

if-else ownership 73 

include file 17. See also library files 
increment operators 44 
index value 

array 101 

use of 101 
indirect member selection operator 198 
indirection operator 

pointers 117 
infinite loops 

for 86 

while 94 
initializing variables 33 

defining declaration 34 

garbage value 34 

stderr 16 

stdin 16 

stdout 16 





input/output 
formatted 407 
conversion characters 408, 411 
conversion specifiers 407 
flags 412 
input from memory 410 
output to memory 414 
precision specification 413 
scanf() 409 
size modifier 413 
width modifier 412 
stderr 16 
stdin 16 
stdout 16 
tips 
logic and loops 88 
storing multiple strings 113 
Insert() function 293 
InsertNode() function 220 
int 
CTYPE.H 235 
integers 
as data type 30 
constants 32 
type modifiers 36 
signed 36 
unsigned 36 
variables 
long 30 
short 31 
Intersection() function 195 
isalnum() function 235 
isalpha() function 235, 379 
iscntrl() function 235 
isdigit() function 235 
isgraph() function 235 
islower() function 235 
isprint() function 235 
ispunct() function 235 
isspace() function 235 
isupper() function 235 
isxdigit() function 235 
iteration variables 91 


K 


keywords 15. See also Appendix C 
char 34 
const 38 
double 37 
else 71 
examples 15 
extern 176 
float 37 
int 30 
long 30 
register 62 
short 31 
signed 36 
sizeof 114 
static 62 
struct 182, 263 
typedef 63 
unsigned 36 
void 147 


L 


ldexp() function 255 
length modifier 
printing text and variables 42 
libraries 
include file (header file) 17 
library files 16 
library files 16 
ASSERT.H 234 
CTYPE.H 234 
ERRNO.H 234 
FLOAT.H 234 
LIMITS.H 234 
LOCALE.H 234 
MATH.H 234 
SETJMP.H 234 
SIGNAL.H 234 
STDARG.H 234 
STDDEF.H 234 
STDIO.H 32, 234 
STDLIB.H 132, 234 
STRING.H 234 
TIME.H 234 

LIMITS.H 234 

linked list 
adding an object 

to the head 213 

to the end of 215 

to the middle of 215 
cleaning up the heap 206 
creating 208 
data entry 205 
data organisation 204 
doubly linked lists 211 
displaying 206 

linking 
creating an executable file 22 
errors 23 

local scope. See automatic variables 

LOCALE.H 234 

localtime() function 236 

log() function 255 

logical operators 75 
AND (&&) 75 
in combination 76 
NOT (!) 76 

OR (||) 76 

long data type 30 
double variables 38 
loop 

definition 83 
do-while 94 
example 83 

for 84 

iteration and floating point 91 
iteration variables 91 
while 92 


macro 
random number generation 
scalar variables 100 
scope 

with variables 59, 173 
Search() function 398 